<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
  <title>The LLVM Compiler Driver (llvmc)</title>
  <link rel="stylesheet" href="llvm.css" type="text/css">
  <style type="text/css">
    TR, TD { border: 2px solid gray; padding: 4pt 4pt 2pt 2pt; }
    TH { border: 2px solid gray; font-weight: bold; font-size: 105%; }
    TABLE { text-align: center; border: 2px solid black;
      border-collapse: collapse; margin-top: 1em; margin-left: 1em;
      margin-right: 1em; margin-bottom: 1em; }
    .td_left { border: 2px solid gray; text-align: left; }
  </style>
  <meta name="author" content="Reid Spencer">
  <meta name="description"
    content="A description of the use and design of the LLVM Compiler Driver.">
</head>
<body>
<div class="doc_title">The LLVM Compiler Driver (llvmc)</div>
<p class="doc_warning">NOTE: This document is a work in progress!</p>
<ol>
  <li><a href="#abstract">Abstract</a></li>
  <li><a href="#introduction">Introduction</a>
    <ol>
      <li><a href="#purpose">Purpose</a></li>
      <li><a href="#operation">Operation</a></li>
      <li><a href="#phases">Phases</a></li>
      <li><a href="#actions">Actions</a></li>
    </ol>
  </li>
  <li><a href="#configuration">Configuration</a>
    <ol>
      <li><a href="#overview">Overview</a></li>
      <li><a href="#filetypes">Configuration Files</a></li>
      <li><a href="#syntax">Syntax</a></li>
      <li><a href="#items">Configuration Items</a></li>
      <li><a href="#substitutions">Substitutions</a></li>
      <li><a href="#sample">Sample Config File</a></li>
    </ol>
  </li>
  <li><a href="#glossary">Glossary</a></li>
</ol>
<div class="doc_author">
<p>Written by <a href="mailto:rspencer@x10sys.com">Reid Spencer</a>
</p>
</div>

<!-- *********************************************************************** -->
<div class="doc_section"> <a name="abstract">Abstract</a></div>
<!-- *********************************************************************** -->
<div class="doc_text">
  <p>This document describes the requirements, design, and configuration of the
  LLVM compiler driver, <tt>llvmc</tt>. The compiler driver knows about LLVM's
  tool set and can be configured to know about a variety of compilers for
  source languages. It uses this knowledge to execute the tools necessary
  to accomplish general compilation, optimization, and linking tasks. The main
  purpose of <tt>llvmc</tt> is to provide a simple and consistent interface to
  all compilation tasks. This reduces the burden on the end user, who can just
  learn to use <tt>llvmc</tt> instead of the entire LLVM tool set and all the
  source language compilers compatible with LLVM.</p>
</div>
<!-- *********************************************************************** -->
<div class="doc_section"> <a name="introduction">Introduction</a></div>
<!-- *********************************************************************** -->
<div class="doc_text">
  <p>The <tt>llvmc</tt> <a href="#def_tool">tool</a> is a configurable compiler
  <a href="#def_driver">driver</a>. As such, it isn't the compiler, optimizer,
  or linker itself, but it drives (invokes) other software that performs those
  tasks. If you are familiar with the GNU Compiler Collection's <tt>gcc</tt>
  tool, <tt>llvmc</tt> is very similar.</p>
  <p>The following introductory sections will help you understand why this tool
  is necessary and what it does.</p>
</div>

<!-- _______________________________________________________________________ -->
<div class="doc_subsection"><a name="purpose">Purpose</a></div>
<div class="doc_text">
  <p><tt>llvmc</tt> was invented to make compilation with LLVM-based compilers
  easier. To accomplish this, <tt>llvmc</tt> strives to:</p>
  <ul>
    <li>Be the single point of access to most of the LLVM tool set.</li>
    <li>Hide the complexities of the LLVM tools through a single interface.</li>
    <li>Provide a consistent interface for compiling all languages.</li>
  </ul>
  <p>Additionally, <tt>llvmc</tt> makes it easier to write a compiler for use
  with LLVM, because it:</p>
  <ul>
    <li>Makes integration of existing non-LLVM tools simple.</li>
    <li>Extends the capabilities of minimal front ends by optimizing their
    output.</li>
    <li>Reduces the number of interfaces a compiler writer must know about
    before a working compiler can be completed (essentially only the VMCore
    interfaces need to be understood).</li>
    <li>Supports source language translator invocation via both dynamically
    loadable shared objects and invocation of an executable.</li>
  </ul>
</div>

<!-- _______________________________________________________________________ -->
<div class="doc_subsection"><a name="operation">Operation</a></div>
<div class="doc_text">
  <p>At a high level, <tt>llvmc</tt> operation is very simple. The basic action
  taken by <tt>llvmc</tt> is to simply invoke some tool or set of tools to fill
  the user's request for compilation. Every execution of <tt>llvmc</tt> takes the
  following sequence of steps:</p>
  <dl>
    <dt><b>Collect Command Line Options</b></dt>
    <dd>The command line options provide the marching orders to <tt>llvmc</tt>
    on what actions it should perform. This is the request the user is making
    of <tt>llvmc</tt> and it is interpreted first. See the <tt>llvmc</tt>
    <a href="CommandGuide/html/llvmc.html">manual page</a> for details on the
    options.</dd>
    <dt><b>Read Configuration Files</b></dt>
    <dd>Based on the options and the suffixes of the filenames presented, a set
    of configuration files is read to configure the actions <tt>llvmc</tt> will
    take. Configuration files are provided by either LLVM or the front end
    compiler tools that <tt>llvmc</tt> invokes. These files determine what
    actions <tt>llvmc</tt> will take in response to the user's request. See
    the section on <a href="#configuration">configuration</a> for more details.
    </dd>
    <dt><b>Determine Phases To Execute</b></dt>
    <dd>Based on the command line options and configuration files,
    <tt>llvmc</tt> determines the compilation <a href="#phases">phases</a> that
    must be executed to satisfy the user's request. This is the primary work of
    <tt>llvmc</tt>.</dd>
    <dt><b>Determine Actions To Execute</b></dt>
    <dd>Each <a href="#phases">phase</a> to be executed can result in the
    invocation of one or more <a href="#actions">actions</a>. An action is
    either a whole program or a function in a dynamically linked shared library.
    In this step, <tt>llvmc</tt> determines the sequence of actions that must be
    executed. Actions will always be executed in a deterministic order.</dd>
    <dt><b>Execute Actions</b></dt>
    <dd>The <a href="#actions">actions</a> necessary to support the user's
    original request are executed sequentially and deterministically. All
    actions result in either the invocation of a whole program to perform the
    action or the loading of a dynamically linkable shared library and invocation
    of a standard interface function within that library.</dd>
    <dt><b>Termination</b></dt>
    <dd>If any action fails (returns a non-zero result code), <tt>llvmc</tt>
    also fails and returns the result code from the failing action. If
    everything succeeds, <tt>llvmc</tt> will return a zero result code.</dd>
  </dl>
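  <p>The last two steps can be illustrated with a minimal Python sketch. It is
  not <tt>llvmc</tt>'s implementation: each action is modeled as a hypothetical
  callable that returns a result code, standing in for either a program
  invocation or a shared library entry point, and the sketch assumes that no
  further actions are attempted once one has failed.</p>

```python
from typing import Callable, Iterable, List

# Hypothetical model: an action is any callable returning a result code.
Action = Callable[[], int]


def run_actions(actions: Iterable[Action]) -> int:
    """Run actions in order; return the first non-zero result code, else 0."""
    for action in actions:
        code = action()
        if code != 0:
            # Assumption: the remaining actions are skipped after a failure.
            return code
    return 0


executed: List[str] = []


def make_action(name: str, code: int) -> Action:
    """Build a stand-in action that records its name and returns `code`."""
    def action() -> int:
        executed.append(name)
        return code
    return action


result = run_actions([
    make_action("translate", 0),
    make_action("optimize", 3),   # simulated failure
    make_action("link", 0),
])
```

  <p>Here <tt>result</tt> carries the code of the failing action, which is the
  value the driver would hand back to its caller.</p>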
  <p><tt>llvmc</tt>'s operation must be simple, regular and predictable.
  Developers need to be able to rely on it to take a consistent approach to
  compilation. For example, the invocation:</p>
  <pre>
  llvmc -O2 x.c y.c z.c -o xyz</pre>
  <p>must produce <i>exactly</i> the same results as:</p>
  <pre>
  llvmc -O2 x.c
  llvmc -O2 y.c
  llvmc -O2 z.c
  llvmc -O2 x.o y.o z.o -o xyz</pre>
  <p>To accomplish this, <tt>llvmc</tt> uses a very simple goal-oriented
  procedure to do its work. The overall goal is to produce a functioning
  executable. To that end, <tt>llvmc</tt> always attempts to execute a
  series of compilation <a href="#def_phase">phases</a> in the same sequence.
  However, the user's options to <tt>llvmc</tt> can cause the sequence of phases
  to start in the middle or finish early.</p>
</div>

<!-- _______________________________________________________________________ -->
<div class="doc_subsection"><a name="phases"></a>Phases </div>
<div class="doc_text">
  <p><tt>llvmc</tt> breaks every compilation task into the following five
  distinct phases:</p>
  <dl><dt><b>Preprocessing</b></dt><dd>Not all languages support preprocessing,
    but for those that do, this phase can be invoked. This phase is for
    languages that provide combining, filtering, or otherwise altering the
    source language input before the translator parses it. Although C and C++
    are the most common users of this phase, other languages may provide their
    own preprocessor (whether it's the C pre-processor or not).</dd>
  </dl>
  <dl><dt><b>Translation</b></dt><dd>The translation phase converts the source
    language input into something that LLVM can interpret and use for
    downstream phases. The translation is essentially from "non-LLVM form" to
    "LLVM form".</dd>
  </dl>
  <dl><dt><b>Optimization</b></dt><dd>Once an LLVM Module has been obtained from
    the translation phase, the program enters the optimization phase. This phase
    attempts to optimize all of the input provided on the command line according
    to the options provided.</dd>
  </dl>
  <dl><dt><b>Linking</b></dt><dd>The inputs are combined to form a complete
    program.</dd>
  </dl>
  <p>The following table shows the inputs, outputs, and command line options
  applicable to each phase.</p>
  <table>
    <tr>
      <th style="width: 10%">Phase</th>
      <th style="width: 25%">Inputs</th>
      <th style="width: 25%">Outputs</th>
      <th style="width: 40%">Options</th>
    </tr>
    <tr><td><b>Preprocessing</b></td>
      <td class="td_left"><ul><li>Source Language File</li></ul></td>
      <td class="td_left"><ul><li>Source Language File</li></ul></td>
      <td class="td_left"><dl>
          <dt><tt>-E</tt></dt>
          <dd>Stops the compilation after preprocessing</dd>
      </dl></td>
    </tr>
    <tr>
      <td><b>Translation</b></td>
      <td class="td_left"><ul>
          <li>Source Language File</li>
      </ul></td>
      <td class="td_left"><ul>
          <li>LLVM Assembly</li>
          <li>LLVM Bytecode</li>
          <li>LLVM C++ IR</li>
      </ul></td>
      <td class="td_left"><dl>
          <dt><tt>-c</tt></dt>
          <dd>Stops the compilation after translation so that optimization and
          linking are not done.</dd>
          <dt><tt>-S</tt></dt>
          <dd>Stops the compilation before object code is written so that only
          assembly code remains.</dd>
      </dl></td>
    </tr>
    <tr>
      <td><b>Optimization</b></td>
      <td class="td_left"><ul>
          <li>LLVM Assembly</li>
          <li>LLVM Bytecode</li>
      </ul></td>
      <td class="td_left"><ul>
          <li>LLVM Bytecode</li>
      </ul></td>
      <td class="td_left"><dl>
          <dt><tt>-Ox</tt></dt>
          <dd>This group of options affects the amount of optimization
          performed.</dd>
      </dl></td>
    </tr>
    <tr>
      <td><b>Linking</b></td>
      <td class="td_left"><ul>
          <li>LLVM Bytecode</li>
          <li>Native Object Code</li>
          <li>LLVM Library</li>
          <li>Native Library</li>
      </ul></td>
      <td class="td_left"><ul>
          <li>LLVM Bytecode Executable</li>
          <li>Native Executable</li>
      </ul></td>
      <td class="td_left"><dl>
          <dt><tt>-L</tt></dt><dd>Specifies a path for library search.</dd>
          <dt><tt>-l</tt></dt><dd>Specifies a library to link in.</dd>
      </dl></td>
    </tr>
  </table>
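  <p>As a rough illustration of how the stopping options in the table narrow
  the phase sequence, the Python sketch below computes which phases would run.
  The function name and the <tt>start_at</tt> parameter are hypothetical, only
  the phases listed in the table are modeled, and <tt>-S</tt> is left out
  because the table does not tie it to a single phase boundary.</p>

```python
# Phases in their fixed order, as listed in the table above.
PHASES = ["preprocessing", "translation", "optimization", "linking"]

# Options from the table that end the sequence early.
STOP_AFTER = {"-E": "preprocessing", "-c": "translation"}


def phases_to_run(options, start_at="preprocessing"):
    """Return the phases to execute, always in the same relative order.

    `start_at` is a hypothetical parameter modeling inputs that let the
    sequence begin in the middle (for example, already translated files).
    """
    last = len(PHASES) - 1
    for opt in options:
        if opt in STOP_AFTER:
            # If several stopping options are given, the earliest one wins.
            last = min(last, PHASES.index(STOP_AFTER[opt]))
    first = PHASES.index(start_at)
    return PHASES[first:last + 1]


full_build = phases_to_run(["-O2"])
translate_only = phases_to_run(["-O2", "-c"])
link_only = phases_to_run([], start_at="linking")
```

  <p>Because the order of <tt>PHASES</tt> never changes, options can only trim
  the front or the back of the sequence, which is the behavior described in
  the Operation section.</p>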
</div>

<!-- _______________________________________________________________________ -->
<div class="doc_subsection"><a name="actions"></a>Actions</div>
<div class="doc_text">
  <p>An action, with regard to <tt>llvmc</tt>, is a basic operation that it takes
  in order to fulfill the user's request. Each phase of compilation will invoke
  zero or more actions in order to accomplish that phase.</p>
  <p>Actions come in two forms:</p>
  <ul>
    <li>Invokable Executables</li>
    <li>Functions in a shared library</li>
  </ul>
</div>

<!-- *********************************************************************** -->
<div class="doc_section"><a name="configuration">Configuration</a></div>
<!-- *********************************************************************** -->
<div class="doc_text">
  <p>This section of the document describes the configuration files used by
  <tt>llvmc</tt>. Configuration information is relatively static for a
  given release of LLVM and a front end compiler. However, the details may
  change from release to release of either. Users are encouraged to simply use
  the various options of the <tt>llvmc</tt> command and ignore the configuration
  of the tool. These configuration files are for compiler writers and LLVM
  developers. Those wishing to simply use <tt>llvmc</tt> don't need to understand
  this section, but it may be instructive on how the tool works.</p>
</div>

<!-- _______________________________________________________________________ -->
<div class="doc_subsection"><a name="overview"></a>Overview</div>
<div class="doc_text">
<p><tt>llvmc</tt> is highly configurable both on the command line and in
configuration files. The options it understands are generic, consistent and
simple by design. Furthermore, the <tt>llvmc</tt> options apply to the
compilation of any LLVM enabled programming language. To be enabled as a
supported source language compiler, a compiler writer must provide a
configuration file that tells <tt>llvmc</tt> how to invoke the compiler
and what its capabilities are. The purpose of the configuration files then
is to allow compiler writers to specify to <tt>llvmc</tt> how the compiler
should be invoked. Users may alter the compiler's <tt>llvmc</tt>
configuration but are advised not to.</p>

<p>Because <tt>llvmc</tt> just invokes other programs, it must deal with the
available command line options for those programs regardless of whether they
were written for LLVM or not. Furthermore, not all compilation front ends will
have the same capabilities. Some front ends will simply generate LLVM assembly
code; others will be able to generate fully optimized bytecode. In general,
<tt>llvmc</tt> doesn't make any assumptions about the capabilities or command
line options of a sub-tool. It simply uses the details found in the
configuration files and leaves it to the compiler writer to specify the
configuration correctly.</p>

<p>This approach means that new compiler front ends can be up and working very
quickly. As a first cut, a front end can simply compile its source to raw
(unoptimized) bytecode or LLVM assembly and <tt>llvmc</tt> can be configured
to pick up the slack (translate LLVM assembly to bytecode, optimize the
bytecode, generate native assembly, link, etc.). In fact, the front end need
not use any LLVM libraries, and it could be written in any language (instead of
C++). The configuration data will allow the full range of optimization,
assembly, and linking capabilities that LLVM provides to be added to these kinds
of tools. Enabling the rapid development of front ends is one of the primary
goals of <tt>llvmc</tt>.</p>

<p>As a compiler front end matures, it may utilize the LLVM libraries and tools
to more efficiently produce optimized bytecode directly in a single compilation
and optimization program. In these cases, multiple tools would not be needed
and the configuration data for the compiler would change.</p>

<p>Configuring <tt>llvmc</tt> to the needs and capabilities of a source language
compiler is relatively straightforward. A compiler writer must provide a
definition of what to do for each of the five compilation phases for each of
the optimization levels. The specification consists simply of prototypical
command lines into which <tt>llvmc</tt> can substitute command line
arguments and file names. Note that any given phase can be completely blank if
the source language's compiler combines multiple phases into a single program.
For example, quite often pre-processing, translation, and optimization are
combined into a single program. The specification for such a compiler would have
blank entries for pre-processing and translation but a full command line for
optimization.</p>
</div>

<!-- _______________________________________________________________________ -->
<div class="doc_subsection"><a name="filetypes"></a>Configuration Files</div>
<div class="doc_text">
  <h3>File Contents</h3>
  <p>Each configuration file provides the details for a single source language
  that is to be compiled. This configuration information tells <tt>llvmc</tt>
  how to invoke the language's pre-processor, translator, optimizer, assembler
  and linker. Note that a given source language needn't provide all these tools
  as many of them already exist in LLVM.</p>

  <h3>Directory Search</h3>
  <p><tt>llvmc</tt> always looks for files of a specific name. It uses the
  first file with the name it's looking for by searching directories in the
  following order:</p>
  <ol>
    <li>Any directory specified by the <tt>--config-dir</tt> option will be
    checked first.</li>
    <li>If the environment variable LLVM_CONFIG_DIR is set, and it contains
    the name of a valid directory, that directory will be searched next.</li>
    <li>If the user's home directory (typically <tt>/home/user</tt>) contains
    a sub-directory named <tt>.llvm</tt> and that directory contains a
    sub-directory named <tt>etc</tt> then that directory will be tried
    next.</li>
    <li>If the LLVM installation directory (typically <tt>/usr/local/llvm</tt>)
    contains a sub-directory named <tt>etc</tt> then that directory will be
    tried next.</li>
    <li>A standard "system" directory will be searched last. This is typically
    <tt>/etc/llvm</tt> on UNIX&trade; and <tt>C:\WINNT</tt> on Microsoft
    Windows&trade;.</li>
    <li>If the configuration file sought still can't be found, <tt>llvmc</tt>
    will print an error message and exit.</li>
  </ol>
  <p>The first file found in this search will be used. Other files with the
  same name will be ignored even if they exist in one of the subsequent search
  locations.</p>
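  <p>The search can be sketched in Python as below. This is an illustration
  under stated assumptions rather than <tt>llvmc</tt>'s own code: the function
  and parameter names are hypothetical, the default locations are the "typical"
  ones quoted in the list, and the directories are tried in the order the list
  gives them.</p>

```python
import os
import tempfile


def candidate_dirs(config_dir=None, environ=None, home=None,
                   install_dir="/usr/local/llvm", system_dir="/etc/llvm"):
    """Return the directories to try, in search order (names are hypothetical)."""
    environ = os.environ if environ is None else environ
    home = os.path.expanduser("~") if home is None else home
    dirs = []
    if config_dir:                               # 1. --config-dir
        dirs.append(config_dir)
    env_dir = environ.get("LLVM_CONFIG_DIR")
    if env_dir and os.path.isdir(env_dir):       # 2. LLVM_CONFIG_DIR, if valid
        dirs.append(env_dir)
    dirs.append(os.path.join(home, ".llvm", "etc"))   # 3. ~/.llvm/etc
    dirs.append(os.path.join(install_dir, "etc"))     # 4. <install>/etc
    dirs.append(system_dir)                           # 5. system directory
    return dirs


def find_config(name, **search):
    """Return the first existing file called `name`, or None if none exists."""
    for directory in candidate_dirs(**search):
        path = os.path.join(directory, name)
        if os.path.isfile(path):
            return path
    return None   # the driver would report an error and exit here


# Demonstration in a scratch directory: the same name exists in two places.
with tempfile.TemporaryDirectory() as root:
    home = os.path.join(root, "home")
    install = os.path.join(root, "llvm")
    system = os.path.join(root, "sys")
    for d in (os.path.join(home, ".llvm", "etc"), os.path.join(install, "etc")):
        os.makedirs(d)
        open(os.path.join(d, "cpp"), "w").close()
    expected = os.path.join(home, ".llvm", "etc", "cpp")
    found = find_config("cpp", environ={}, home=home,
                        install_dir=install, system_dir=system)
    missing = find_config("pas", environ={}, home=home,
                          install_dir=install, system_dir=system)
```

  <p>In the demonstration the copy under the home directory wins, and the copy
  in the installation directory is never consulted.</p>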

  <h3>File Names</h3>
  <p>In the directories searched, each configuration file is given a specific
  name to foster faster lookup (so <tt>llvmc</tt> doesn't have to do directory
  searches). The name of a given language-specific configuration file is simply
  the same as the suffix used to identify files containing source in that
  language. For example, a configuration file for C++ source might be named
  <tt>cpp</tt>, <tt>C</tt>, or <tt>cxx</tt>. For languages that support multiple
  file suffixes, multiple (probably identical) files (or symbolic links) will
  need to be provided.</p>

  <h3>What Gets Read</h3>
  <p>Which configuration files are read depends on the command line options and
  the suffixes of the file names provided on <tt>llvmc</tt>'s command line. Note
  that the <tt>--x LANGUAGE</tt> option alters the language that <tt>llvmc</tt>
  uses for the subsequent files on the command line. Only the configuration
  files actually needed to complete <tt>llvmc</tt>'s task are read. Other
  language-specific files will be ignored.</p>
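  <p>A small Python sketch of this selection follows. It is only an
  approximation: the function name is hypothetical, the sketch assumes that the
  value given to <tt>--x</tt> is itself the configuration file name, and the
  only option it treats as taking a value is <tt>-o</tt>.</p>

```python
import os


def configs_needed(args):
    """Return the configuration file names needed, in order of first use."""
    needed = []
    language = None               # set by a preceding "--x LANGUAGE"
    it = iter(args)
    for arg in it:
        if arg == "--x":
            # Assumption: LANGUAGE doubles as the configuration file name.
            language = next(it, None)
            continue
        if arg == "-o":
            next(it, None)        # skip the output file name
            continue
        if arg.startswith("-"):
            continue              # other options select no configuration file
        # Without an override, the file suffix names the configuration file.
        name = language or os.path.splitext(arg)[1].lstrip(".")
        if name and name not in needed:
            needed.append(name)
    return needed


by_suffix = configs_needed(["-O2", "a.cpp", "b.c", "c.cpp", "-o", "prog"])
overridden = configs_needed(["--x", "cxx", "a.txt", "b.inc"])
```

  <p>Each name appears once however many input files share it, which matches
  the rule that only the files actually needed are read.</p>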
Reid Spencerb1254a12004-08-09 03:08:29 +0000390</div>
391
392<!-- _______________________________________________________________________ -->
393<div class="doc_subsection"><a name="syntax"></a>Syntax</div>
394<div class="doc_text">
Reid Spenceraaa3da92004-08-17 09:18:37 +0000395 <p>The syntax of the configuration files is very simple and somewhat
396 compatible with Java's property files. Here are the syntax rules:</p>
Reid Spencerb1254a12004-08-09 03:08:29 +0000397 <ul>
Reid Spenceraaa3da92004-08-17 09:18:37 +0000398 <li>The file encoding is ASCII.</li>
Reid Spencereefdae52004-08-21 22:37:42 +0000399 <li>The file is line oriented. There should be one configuration definition
400 per line. Lines are terminated by the newline character (0x0A).</li>
401 <li>A backslash (<tt>\</tt>) before a newline causes the newline to be
402 ignored. This is useful for line continuation of long definitions. A
403 backslash anywhere else is recognized as a backslash.</li>
Reid Spenceraaa3da92004-08-17 09:18:37 +0000404 <li>A configuration item consists of a name, an <tt>=</tt> and a value.</li>
405 <li>A name consists of a sequence of identifiers separated by period.</li>
406 <li>An identifier consists of specific keywords made up of only lower case
407 and upper case letters (e.g. <tt>lang.name</tt>).</li>
408 <li>Values come in four flavors: booleans, integers, commands and
409 strings.</li>
410 <li>Valid "false" boolean values are <tt>false False FALSE no No NO
411 off Off</tt> and <tt>OFF</tt>.</li>
412 <li>Valid "true" boolean values are <tt>true True TRUE yes Yes YES
413 on On</tt> and <tt>ON</tt>.</li>
414 <li>Integers are simply sequences of digits.</li>
415 <li>Commands start with a program name and are followed by a sequence of
416 words that are passed to that program as command line arguments. Program
417 arguments that begin and end with the <tt>@</tt> sign will have their value
418 substituted. Program names beginning with <tt>/</tt> are considered to be
419 absolute. Otherwise the <tt>PATH</tt> will be applied to find the program to
420 execute.</li>
421 <li>Strings are composed of multiple sequences of characters from the
422 character class <tt>[-A-Za-z0-9_:%+/\\|,]</tt> separated by white
423 space.</li>
424 <li>White space on a line is folded. Multiple blanks or tabs will be
425 reduced to a single blank.</li>
426 <li>White space before the configuration item's name is ignored.</li>
427 <li>White space on either side of the <tt>=</tt> is ignored.</li>
428 <li>White space in a string value is used to separate the individual
429 components of the string value but otherwise ignored.</li>
430 <li>Comments are introduced by the <tt>#</tt> character. Everything after a
431 <tt>#</tt> and before the end of line is ignored.</li>
432 </ul>
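  <p>To make the rules concrete, here is a minimal Python sketch of a reader
  for this syntax. It is a simplification, not the parser <tt>llvmc</tt> uses:
  it recognizes booleans and integers, returns everything else as a white space
  folded string (it cannot tell commands from strings without knowing the
  item), and does not validate names. The sample text and the program name
  <tt>mycxx</tt> are made up for the example.</p>

```python
import re

TRUE_WORDS = {"true", "True", "TRUE", "yes", "Yes", "YES", "on", "On", "ON"}
FALSE_WORDS = {"false", "False", "FALSE", "no", "No", "NO", "off", "Off", "OFF"}


def parse_config(text):
    """Parse configuration text into a dict mapping item names to values."""
    # A backslash directly before a newline joins the two physical lines.
    text = text.replace("\\\n", "")
    items = {}
    for line in text.split("\n"):
        line = line.split("#", 1)[0]                 # drop comments
        line = re.sub(r"[ \t]+", " ", line).strip()  # fold white space
        if not line:
            continue
        name, sep, value = line.partition("=")
        if not sep:
            raise ValueError("expected name = value, got: " + line)
        name, value = name.strip(), value.strip()
        if value in TRUE_WORDS:
            items[name] = True
        elif value in FALSE_WORDS:
            items[name] = False
        elif re.fullmatch(r"[0-9]+", value):
            items[name] = int(value)
        else:
            items[name] = value                      # command or string
    return items


# Hypothetical sample; "mycxx" is an invented translator program.
SAMPLE = """\
# configuration for an imaginary C++ front end
lang.name = C++
preprocessor.required   =   no
translator.command = mycxx %in% \\
    -o %out%
"""

parsed = parse_config(SAMPLE)
```

  <p>The continued <tt>translator.command</tt> definition comes back as a
  single value with its internal white space folded to single blanks.</p>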
</div>

<!-- _______________________________________________________________________ -->
<div class="doc_subsection"><a name="items">Configuration Items</a></div>
<div class="doc_text">
  <p>The table below provides definitions of the allowed configuration items
  that may appear in a configuration file. Every item has a default value and
  does not need to appear in the configuration file. Missing items will have the
  default value. Each identifier may appear as all lower case, first letter
  capitalized or all upper case.</p>
  <table>
    <tbody>
      <tr>
        <th>Name</th>
        <th>Value Type</th>
        <th>Description</th>
        <th>Default</th>
      </tr>
      <tr><td colspan="4"><h4>LANG ITEMS</h4></td></tr>
      <tr>
        <td><b>lang.name</b></td>
        <td>string</td>
        <td class="td_left">Provides the common name for a language definition.
          For example "C++", "Pascal", "FORTRAN", etc.</td>
        <td><i>blank</i></td>
      </tr>
      <tr>
        <td><b>lang.opt1</b></td>
        <td>string</td>
        <td class="td_left">Specifies the parameters to give the optimizer when
          <tt>-O1</tt> is specified on the <tt>llvmc</tt> command line.</td>
        <td><tt>-simplifycfg -instcombine -mem2reg</tt></td>
      </tr>
      <tr>
        <td><b>lang.opt2</b></td>
        <td>string</td>
        <td class="td_left">Specifies the parameters to give the optimizer when
          <tt>-O2</tt> is specified on the <tt>llvmc</tt> command line.</td>
        <td><i>TBD</i></td>
      </tr>
      <tr>
        <td><b>lang.opt3</b></td>
        <td>string</td>
        <td class="td_left">Specifies the parameters to give the optimizer when
          <tt>-O3</tt> is specified on the <tt>llvmc</tt> command line.</td>
        <td><i>TBD</i></td>
      </tr>
      <tr>
        <td><b>lang.opt4</b></td>
        <td>string</td>
        <td class="td_left">Specifies the parameters to give the optimizer when
          <tt>-O4</tt> is specified on the <tt>llvmc</tt> command line.</td>
        <td><i>TBD</i></td>
      </tr>
      <tr>
        <td><b>lang.opt5</b></td>
        <td>string</td>
        <td class="td_left">Specifies the parameters to give the optimizer when
          <tt>-O5</tt> is specified on the <tt>llvmc</tt> command line.</td>
        <td><i>TBD</i></td>
      </tr>
      <tr><td colspan="4"><h4>PREPROCESSOR ITEMS</h4></td></tr>
      <tr>
        <td><b>preprocessor.command</b></td>
        <td>command</td>
        <td class="td_left">This provides the command prototype that will be used
          to run the preprocessor. This is generally only used with the
          <tt>-E</tt> option.</td>
        <td>&lt;blank&gt;</td>
      </tr>
      <tr>
        <td><b>preprocessor.required</b></td>
        <td>boolean</td>
        <td class="td_left">This item specifies whether the pre-processing phase
          is required by the language. If the value is true, then the
          <tt>preprocessor.command</tt> value must not be blank. With this option,
          <tt>llvmc</tt> will always run the preprocessor as it assumes that the
          translation and optimization phases don't know how to pre-process their
          input.</td>
        <td>false</td>
      </tr>
      <tr><td colspan="4"><h4>TRANSLATOR ITEMS</h4></td></tr>
      <tr>
        <td><b>translator.command</b></td>
        <td>command</td>
        <td class="td_left">This provides the command prototype that will be used
          to run the translator. Valid substitutions are <tt>%in%</tt> for the
          input file and <tt>%out%</tt> for the output file.</td>
        <td>&lt;blank&gt;</td>
      </tr>
      <tr>
        <td><b>translator.output</b></td>
        <td><tt>native</tt>, <tt>bytecode</tt> or <tt>assembly</tt></td>
        <td class="td_left">This item specifies the kind of output the language's
          translator generates.</td>
        <td><tt>bytecode</tt></td>
      </tr>
      <tr>
        <td><b>translator.preprocesses</b></td>
        <td>boolean</td>
        <td class="td_left">Indicates that the translator also preprocesses. If
          this is true, then <tt>llvmc</tt> will skip the pre-processing phase
          whenever the final phase is not pre-processing.</td>
        <td><tt>false</tt></td>
      </tr>
      <tr>
        <td><b>translator.optimizes</b></td>
        <td>boolean</td>
        <td class="td_left">Indicates that the translator also optimizes. If
          this is true, then <tt>llvmc</tt> will skip the optimization phase
          whenever the final phase is optimization or later.</td>
        <td><tt>false</tt></td>
      </tr>
      <tr>
        <td><b>translator.groks_dash_o</b></td>
        <td>boolean</td>
        <td class="td_left">Indicates that the translator understands the
          <i>intent</i> of the various <tt>-O</tt><i>n</i> options to
          <tt>llvmc</tt>. This will cause the <tt>-O</tt><i>n</i> option to be
          given to the translator instead of the equivalent options provided by
          <tt>lang.opt</tt><i>n</i>.</td>
        <td><tt>false</tt></td>
      </tr>
      <tr><td colspan="4"><h4>OPTIMIZER ITEMS</h4></td></tr>
      <tr>
        <td><b>optimizer.command</b></td>
        <td>command</td>
        <td class="td_left">This provides the command prototype that will be used
          to run the optimizer. Valid substitutions are <tt>%in%</tt> for the
          input file and <tt>%out%</tt> for the output file.</td>
        <td>&lt;blank&gt;</td>
      </tr>
      <tr>
        <td><b>optimizer.output</b></td>
        <td><tt>native</tt>, <tt>bytecode</tt> or <tt>assembly</tt></td>
        <td class="td_left">This item specifies the kind of output the language's
          optimizer generates.</td>
        <td><tt>bytecode</tt></td>
      </tr>
      <tr>
        <td><b>optimizer.preprocesses</b></td>
        <td>boolean</td>
        <td class="td_left">Indicates that the optimizer also preprocesses. If
          this is true, then <tt>llvmc</tt> will skip the pre-processing phase
          whenever the final phase is optimization or later.</td>
        <td><tt>false</tt></td>
      </tr>
      <tr>
        <td><b>optimizer.translates</b></td>
        <td>boolean</td>
        <td class="td_left">Indicates that the optimizer also translates. If
          this is true, then <tt>llvmc</tt> will skip the translation phase
          whenever the final phase is optimization or later.</td>
        <td><tt>false</tt></td>
      </tr>
      <tr>
        <td><b>optimizer.groks_dash_o</b></td>
        <td>boolean</td>
        <td class="td_left">Indicates that the optimizer understands the
          <i>intent</i> of the various <tt>-O</tt><i>n</i> options to
          <tt>llvmc</tt>. This will cause the <tt>-O</tt><i>n</i> option to be
          given to the optimizer instead of the equivalent options provided by
          <tt>lang.opt</tt><i>n</i>.</td>
        <td><tt>false</tt></td>
      </tr>
      <tr><td colspan="4"><h4>ASSEMBLER ITEMS</h4></td></tr>
      <tr>
        <td><b>assembler.command</b></td>
        <td>command</td>
        <td class="td_left">This provides the command prototype that will be used
          to run the assembler. Valid substitutions are <tt>%in%</tt> for the
          input file and <tt>%out%</tt> for the output file.</td>
        <td>&lt;blank&gt;</td>
      </tr>
    <tr><td colspan="4"><h4>LINKER ITEMS</h4></td></tr>
    <tr>
      <td><b>linker.libs</b></td>
      <td>library names</td>
      <td class="td_left">This provides the list of runtime libraries that the
      source language <i>could</i> link with. In general, the libraries
      needed will be encoded into the LLVM Assembly or bytecode file.
      However, this list tells <tt>llvmc</tt> the names of the ones that
      apply to this source language. The names provided here should be
      unadorned, with no suffix and no "lib" prefix.
      </td>
      <td>&lt;blank&gt;</td>
    </tr>
    <tr>
      <td><b>linker.lib_paths</b></td>
      <td>Fully qualified local path names</td>
      <td class="td_left">This item provides a list of potential directories
      in which the source language's runtime libraries might be located. If
      an object file compiled with this language's translator is linked,
      then these directories will be given as <tt>-L</tt> options to the
      linker.</td>
      <td><tt>&lt;blank&gt;</tt></td>
    </tr>
    <tr>
      <td><b>linker.output</b></td>
      <td><tt>native</tt>, <tt>bytecode</tt> or <tt>assembly</tt></td>
      <td class="td_left">This item specifies the kind of output the language's
      translator generates.</td>
      <td><tt>bytecode</tt></td>
    </tr>
  </tbody>
  </table>
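  <p>The <tt>optimizer.preprocesses</tt> and <tt>optimizer.translates</tt>
  items in the table above both describe phases being skipped. As a rough
  illustration only, that rule might be sketched in Python as follows. The
  function name <tt>planned_phases</tt> and the dictionary-based configuration
  with Python booleans are hypothetical and are not part of <tt>llvmc</tt>;
  the sketch also assumes that the five phases always run in their usual
  order.</p>

```python
# Hypothetical sketch: none of these names come from llvmc itself.
PHASES = ["preprocessing", "translation", "optimization", "assembly", "linking"]


def planned_phases(final_phase, config):
    """Return the phases to run, ending at final_phase."""
    # Slicing copies the list, so PHASES itself is never modified.
    phases = PHASES[: PHASES.index(final_phase) + 1]
    # The skip rules only apply when the optimizer is going to run,
    # that is, when the final phase is optimization or later.
    if "optimization" in phases:
        if config.get("optimizer.preprocesses", False):
            phases.remove("preprocessing")
        if config.get("optimizer.translates", False):
            phases.remove("translation")
    return phases


print(planned_phases("linking", {}))
print(planned_phases("assembly", {"optimizer.translates": True}))
```

  <p>With an empty configuration all five phases are planned. With
  <tt>optimizer.translates</tt> set, the translation phase is dropped, but
  only because the final phase is at or beyond optimization.</p>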
</div>

<!-- _______________________________________________________________________ -->
<div class="doc_subsection"><a name="substitutions">Substitutions</a></div>
<div class="doc_text">
  <p>On any configuration item that ends in <tt>command</tt>, you must
  specify substitution tokens. Substitution tokens begin and end with a percent
  sign (<tt>%</tt>) and are replaced by the corresponding text. Any substitution
  token may be given on any <tt>command</tt> line, but some are more useful than
  others. In particular, each command <em>should</em> have both an <tt>%in%</tt>
  and an <tt>%out%</tt> substitution. The table below provides definitions of
  each of the allowed substitution tokens.</p>
  <table>
  <tbody>
    <tr>
      <th>Substitution Token</th>
      <th>Replacement Description</th>
    </tr>
    <tr>
      <td><tt>%args%</tt></td>
      <td class="td_left">Replaced with all the tool-specific arguments given
      to <tt>llvmc</tt> via the <tt>-T</tt> set of options. This just allows
      you to place these arguments in the correct place on the command line.
      If the <tt>%args%</tt> token does not appear on your command line, then
      you are explicitly disallowing the <tt>-T</tt> option for your tool.
      </td>
    </tr>
    <tr>
      <td><tt>%in%</tt></td>
      <td class="td_left">Replaced with the full path of the input file. You
      needn't worry about the cascading of file names. <tt>llvmc</tt> will
      create temporary files and ensure that the output of one phase is the
      input to the next phase.</td>
    </tr>
    <tr>
      <td><tt>%opt%</tt></td>
      <td class="td_left">Replaced with the optimization options for the
      tool. If the tool understands the <tt>-O</tt> options, then the
      <tt>-O</tt> option itself will be passed. Otherwise, the
      <tt>lang.optN</tt> series of configuration items will specify which
      arguments are to be given.</td>
    </tr>
    <tr>
      <td><tt>%out%</tt></td>
      <td class="td_left">Replaced with the full path of the output file.
      Note that this is not necessarily the output file specified with the
      <tt>-o</tt> option on <tt>llvmc</tt>'s command line. It might be a
      temporary file that will be passed to a subsequent phase's input.
      </td>
    </tr>
    <tr>
      <td><tt>%stats%</tt></td>
      <td class="td_left">If your command accepts the <tt>-stats</tt> option,
      use this substitution token. If the user requested <tt>-stats</tt>
      from the <tt>llvmc</tt> command line, then this token will be replaced
      with <tt>-stats</tt>; otherwise it will be ignored.
      </td>
    </tr>
    <tr>
      <td><tt>%target%</tt></td>
      <td class="td_left">Replaced with the name of the target "machine" for
      which code should be generated. The value used here is taken from the
      <tt>llvmc</tt> option <tt>-march</tt>.
      </td>
    </tr>
    <tr>
      <td><tt>%time%</tt></td>
      <td class="td_left">If your command accepts the <tt>-time-passes</tt>
      option, use this substitution token. If the user requested
      <tt>-time-passes</tt> from the <tt>llvmc</tt> command line, then this
      token will be replaced with <tt>-time-passes</tt>; otherwise it will
      be ignored.
      </td>
    </tr>
  </tbody>
  </table>
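  <p>To make the substitution step concrete, here is a minimal Python sketch
  of how the tokens in the table above could be expanded into a command line.
  It is not <tt>llvmc</tt>'s implementation: the function
  <tt>expand_command</tt>, the <tt>values</tt> dictionary and the file names
  are made up for illustration, and the sketch assumes that a token with no
  value simply expands to nothing.</p>

```python
import re

# Hypothetical sketch: expand_command is not an llvmc function.
TOKEN = re.compile(r"%(\w+)%")


def expand_command(template, values):
    """Replace each %name% token with values[name], or with nothing."""
    expanded = TOKEN.sub(lambda m: values.get(m.group(1), ""), template)
    # Collapse the gaps left behind by tokens that expanded to nothing.
    return " ".join(expanded.split())


template = "stkrc -s 2048 %in% -o %out% %time% %stats% %args%"
values = {"in": "hello.st", "out": "/tmp/hello.bc", "stats": "-stats"}
print(expand_command(template, values))
```

  <p>Collapsing whitespace keeps the sketch short, but it would also alter
  quoted arguments that contain runs of spaces, so a real driver would need
  to build an argument list instead of a single string.</p>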
</div>

<!-- _______________________________________________________________________ -->
<div class="doc_subsection"><a name="sample">Sample Config File</a></div>
<div class="doc_text">
  <p>Since an example is always instructive, here's how the Stacker language
  configuration file looks.</p>
  <pre><tt>
# Stacker Configuration File For llvmc

##########################################################
# Language definitions
##########################################################
  lang.name=Stacker
  lang.opt1=-simplifycfg -instcombine -mem2reg
  lang.opt2=-simplifycfg -instcombine -mem2reg -load-vn \
    -gcse -dse -scalarrepl -sccp
  lang.opt3=-simplifycfg -instcombine -mem2reg -load-vn \
    -gcse -dse -scalarrepl -sccp -branch-combine -adce \
    -globaldce -inline -licm -pre
  lang.opt4=-simplifycfg -instcombine -mem2reg -load-vn \
    -gcse -dse -scalarrepl -sccp -ipconstprop \
    -branch-combine -adce -globaldce -inline -licm -pre
  lang.opt5=-simplifycfg -instcombine -mem2reg -load-vn \
    -gcse -dse -scalarrepl -sccp -ipconstprop \
    -branch-combine -adce -globaldce -inline -licm -pre \
    -block-placement

##########################################################
# Pre-processor definitions
##########################################################

  # Stacker doesn't have a preprocessor but the following
  # allows the -E option to be supported
  preprocessor.command=cp %in% %out%
  preprocessor.required=false

##########################################################
# Translator definitions
##########################################################

  # To compile stacker source, we just run the stacker
  # compiler with a default stack size of 2048 entries.
  translator.command=stkrc -s 2048 %in% -o %out% %time% \
    %stats% %args%

  # stkrc doesn't preprocess but we set this to true so
  # that we don't run the cp command by default.
  translator.preprocesses=true

  # The translator is required to run.
  translator.required=true

  # stkrc doesn't do any optimization, it just translates
  translator.optimizes=no

  # stkrc doesn't handle the -On options
  translator.groks_dash_O=no

##########################################################
# Optimizer definitions
##########################################################

  # For optimization, we use the LLVM "opt" program
  optimizer.command=opt %in% -o %out% %opt% %time% %stats% \
    %args%

  # opt doesn't (yet) grok -On
  optimizer.groks_dash_O=no

  # opt doesn't translate
  optimizer.translates=no

  # opt doesn't preprocess
  optimizer.preprocesses=no

##########################################################
# Assembler definitions
##########################################################
  assembler.command=llc %in% -o %out% %target% \
    "-regalloc=linearscan" %time% %stats%

##########################################################
# Linker definitions
##########################################################
  linker.libs=stkr_runtime
  linker.lib_paths=
</tt></pre>
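  <p>For readers who want to experiment with a file like the one above, the
  following Python sketch reads it into a dictionary. It assumes only what the
  sample itself shows: <tt>#</tt> starts a comment line, each item is written
  as <tt>name=value</tt>, and a trailing backslash continues the value on the
  next line. The function <tt>parse_config</tt> is hypothetical and may not
  match the rules <tt>llvmc</tt> actually applies.</p>

```python
# Hypothetical sketch: parse_config is not part of llvmc.
def parse_config(text):
    """Parse name=value items, joining backslash-continued lines."""
    items = {}
    pending = ""  # text gathered so far from continued lines
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines and comments unless we are mid-continuation.
        if not pending and (not line or line.startswith("#")):
            continue
        if line.endswith("\\"):
            pending += line[:-1].rstrip() + " "
            continue
        # Split at the first "=" only, so values may contain "=" themselves.
        name, _, value = (pending + line).partition("=")
        items[name.strip()] = value.strip()
        pending = ""
    return items


SAMPLE = r"""
# Pre-processor definitions
 preprocessor.command=cp %in% %out%
 optimizer.command=opt %in% -o %out% %opt% %time% %stats% \
    %args%
 optimizer.translates = no
"""

for name, value in parse_config(SAMPLE).items():
    print(name, "=", value)
```

  <p>Splitting at the first <tt>=</tt> matters for values such as
  <tt>"-regalloc=linearscan"</tt> in the assembler command, which contain an
  equals sign of their own.</p>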
</div>

<!-- *********************************************************************** -->
<div class="doc_section"><a name="glossary">Glossary</a></div>
<!-- *********************************************************************** -->
<div class="doc_text">
  <p>This document uses precise terms in reference to the various artifacts and
  concepts related to compilation. The terms used throughout this document are
  defined below.</p>
  <dl>
    <dt><a name="def_assembly"><b>assembly</b></a></dt>
    <dd>A compilation <a href="#def_phase">phase</a> in which LLVM bytecode or
    LLVM assembly code is assembled to a native code format (either
    target-specific assembly language or the platform's native object file
    format).
    </dd>

    <dt><a name="def_compiler"><b>compiler</b></a></dt>
    <dd>Refers to any program that can be invoked by <tt>llvmc</tt> to accomplish
    the work of one or more compilation <a href="#def_phase">phases</a>.</dd>

    <dt><a name="def_driver"><b>driver</b></a></dt>
    <dd>Refers to <tt>llvmc</tt> itself.</dd>

    <dt><a name="def_linking"><b>linking</b></a></dt>
    <dd>A compilation <a href="#def_phase">phase</a> in which LLVM bytecode files
    and (optionally) native system libraries are combined to form a complete
    executable program.</dd>

    <dt><a name="def_optimization"><b>optimization</b></a></dt>
    <dd>A compilation <a href="#def_phase">phase</a> in which LLVM bytecode is
    optimized.</dd>

    <dt><a name="def_phase"><b>phase</b></a></dt>
    <dd>Refers to any one of the five compilation phases that
    <tt>llvmc</tt> supports. The five phases are:
    <a href="#def_preprocessing">preprocessing</a>,
    <a href="#def_translation">translation</a>,
    <a href="#def_optimization">optimization</a>,
    <a href="#def_assembly">assembly</a>,
    <a href="#def_linking">linking</a>.</dd>

    <dt><a name="def_sourcelanguage"><b>source language</b></a></dt>
    <dd>Any common programming language (e.g. C, C++, Java, Stacker, ML,
    FORTRAN). These languages are distinguished from any of the lower-level
    languages (such as LLVM or native assembly) by the fact that a
    <a href="#def_translation">translation</a> <a href="#def_phase">phase</a>
    is required before LLVM can be applied.</dd>

    <dt><a name="def_tool"><b>tool</b></a></dt>
    <dd>Refers to any program in the LLVM tool set.</dd>

    <dt><a name="def_translation"><b>translation</b></a></dt>
    <dd>A compilation <a href="#def_phase">phase</a> in which
    <a href="#def_sourcelanguage">source language</a> code is translated into
    either LLVM assembly language or LLVM bytecode.</dd>
  </dl>
</div>
<!-- *********************************************************************** -->
<hr>
<address> <a href="http://jigsaw.w3.org/css-validator/check/referer"><img
 src="http://jigsaw.w3.org/css-validator/images/vcss" alt="Valid CSS!"></a><a
 href="http://validator.w3.org/check/referer"><img
 src="http://www.w3.org/Icons/valid-html401" alt="Valid HTML 4.01!"></a><a
 href="mailto:rspencer@x10sys.com">Reid Spencer</a><br>
<a href="http://llvm.cs.uiuc.edu">The LLVM Compiler Infrastructure</a><br>
Last modified: $Date$
</address>
<!-- vim: sw=2
-->
</body>
</html>