| =pod |
| |
| =head1 NAME |
| |
| llvmc - The LLVM Compiler Driver |
| |
| =head1 SYNOPSIS |
| |
| B<llvmc> [I<options>] [I<filenames>...] |
| |
| =head1 DESCRIPTION |
| |
| The B<llvmc> command is a configurable driver for invoking other |
| LLVM (and non-LLVM) tools in order to compile, optimize and link software |
| for multiple languages. For those familiar with the GNU Compiler |
| Collection's B<gcc> tool, it is very similar. This tool has the |
| following main goals or purposes: |
| |
| =over |
| |
| =item * A single point of access to the LLVM tool set. |
| |
| =item * Hide the complexities of the LLVM tools through a single interface. |
| |
| =item * Make integration of existing non-LLVM tools simple. |
| |
| =item * Extend the capabilities of minimal front ends. |
| |
| =item * Make the interface for compiling consistent for all languages. |
| |
| =back |
| |
| The tool itself does nothing with a user's program. It merely invokes other |
| tools to get the compilation tasks done. |
| |
| The options supported by B<llvmc> generalize the compilation process and |
| provide a consistent and simple interface for multiple programming languages. |
| This makes it easier for developers to get their software compiled with LLVM. |
| Without B<llvmc>, developers would need to understand how to invoke the |
| front-end compiler, optimizer, assembler, and linker in order to compile their |
| programs. B<llvmc>'s sole mission is to trivialize that process. |
| |
| =head2 Basic Operation |
| |
| B<llvmc> always takes the following basic actions: |
| |
| =over |
| |
| =item * Command line options and filenames are collected. |
| |
| The command line options provide the marching orders to B<llvmc> on what actions |
| it should perform. This is the I<request> the user is making of B<llvmc> and it |
| is interpreted first. |
| |
| =item * Configuration files are read. |
| |
| Based on the options and the suffixes of the filenames presented, a set of |
| configuration files is read to configure the actions B<llvmc> will take |
| (more on this later). |
| |
| =item * Determine actions to take. |
| |
| The tool chain needed to complete the task is determined. This is the primary |
| work of B<llvmc>. It breaks the request specified by the command line options |
| into a set of basic actions to be done: |
| |
| =over |
| |
| =item * Pre-processing: gathering/filtering compiler input |
| |
| =item * Compilation: source language to bytecode conversion |
| |
| =item * Assembly: bytecode to native code conversion |
| |
| =item * Optimization: conversion of bytecode to something that runs faster |
| |
| =item * Linking: combining multiple bytecode files to produce an executable program |
| |
| =back |
| |
| =item * Execute actions. |
| |
| The actions determined previously are executed sequentially and then |
| B<llvmc> terminates. |
| |
| =back |
| |
| =head1 OPTIONS |
| |
| =head2 Control Options |
| |
| Control options tell B<llvmc> what to do at a high level. The |
| following control options are defined: |
| |
| =over |
| |
| =item B<-c> or B<--compile> |
| |
| This option specifies that the linking phase is not to be run. All |
| previous phases, if applicable, will run. This is generally how a given |
| bytecode file is compiled and optimized for a source language module. |
| |
| =item B<-k> or B<--link> or default |
| |
| This option (or the lack of any control option) specifies that all stages |
| of compilation, optimization, and linking should be attempted. Source files |
| specified on the command line will be compiled and linked with objects and |
| libraries also specified. |
| |
| =item B<-S> or B<--assemble> |
| |
| This option specifies that compilation should end in the creation of |
| an LLVM assembly file that can be later converted to an LLVM object |
| file. |
| |
| =item B<-E> or B<--preprocess> |
| |
| This option specifies that no compilation or linking should be |
| performed. Only pre-processing, if applicable to the language being |
| compiled, is performed. For languages that support it, this will |
| result in the output containing the raw input to the compiler. |
| |
| =back |
| |
| =head2 Optimization Options |
| |
| Optimization with B<llvmc> is based on goals and specified with |
| the following B<-O> options. The specific details of which |
| optimizations run are controlled by the configuration files because |
| each source language will have different needs. |
| |
| =over |
| |
| =item B<-O1> or B<-O0> (default, fast compilation) |
| |
| Only those optimizations that will hasten the compilation (mostly by reducing |
| the output) are applied. In general these are extremely fast and simple |
| optimizations that reduce emitted code size. The goal here is not to make the |
| resulting program fast but to make the compilation fast. If not specified, |
| this is the default level of optimization. |
| |
| =item B<-O2> (basic optimization) |
| |
| This level of optimization specifies a balance between generating good code |
| that will execute reasonably quickly and not spending too much time optimizing |
| the code to get there. For example, this level of optimization may include |
| things like global common subexpression elimination, aggressive dead code |
| elimination, and scalar replacement of aggregates. |
| |
| =item B<-O3> (aggressive optimization) |
| |
| This level of optimization aggressively optimizes each set of files compiled |
| together. However, no link-time inter-procedural optimization is performed. |
| This level implies all the optimizations of the B<-O1> and B<-O2> optimization |
| levels, and should also provide loop optimizations and compile time |
| inter-procedural optimizations. Essentially, this level tries to do as much |
| as it can with the input it is given but doesn't do any link time IPO. |
| |
| =item B<-O4> (link time optimization) |
| |
| In addition to the previous three levels of optimization, this level of |
| optimization aggressively optimizes each program at link time. It employs |
| basic analysis and basic link-time inter-procedural optimizations, |
| considering the program as a whole. |
| |
| =item B<-O5> (aggressive link time optimization) |
| |
| This is the same as B<-O4> except it employs aggressive analyses and |
| aggressive inter-procedural optimization. |
| |
| =item B<-O6> (profile guided optimization: not implemented) |
| |
| This is the same as B<-O5> except that it employs profile-guided |
| re-optimization of the program after it has executed. Note that this implies |
| a single level of re-optimization based on runtime profile analysis. Once |
| the re-optimization has completed, the profiling instrumentation is |
| removed and final optimizations are employed. |
| |
| =item B<-O7> (lifelong optimization: not implemented) |
| |
| This is the same as B<-O5> and similar to B<-O6> except that re-optimization |
| is performed through the life of the program. That is, each run will update |
| the profile by which future re-optimizations are directed. |
| |
| =back |
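| |
| The level names above can be summarized as a small lookup. The following is |
| a minimal Python sketch, not B<llvmc>'s implementation: the helper |
| C<resolve_opt_level> is hypothetical, and letting the last B<-O> option on the |
| command line win is an assumption that this document does not state. |
| |
```python
# Labels copied from the option list above.
LEVELS = {
    1: "fast compilation",
    2: "basic optimization",
    3: "aggressive optimization",
    4: "link time optimization",
    5: "aggressive link time optimization",
    6: "profile guided optimization",
    7: "lifelong optimization",
}
# -O6 and -O7 are documented as not implemented.
NOT_IMPLEMENTED = {6, 7}


def resolve_opt_level(args):
    """Return (level, label) for the -O option in args, defaulting to -O1."""
    level = 1  # documented default
    for arg in args:
        if arg.startswith("-O") and arg[2:].isdigit():
            level = int(arg[2:])  # assumption: the last -O option wins
    if level == 0:
        level = 1  # -O0 is documented as a synonym for -O1
    if level not in LEVELS:
        raise ValueError("unknown optimization level -O%d" % level)
    if level in NOT_IMPLEMENTED:
        raise NotImplementedError("-O%d is not implemented" % level)
    return level, LEVELS[level]
```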
| |
| =head2 Input Options |
| |
| =over |
| |
| =item B<-l> I<LIBRARY> |
| |
| This option instructs B<llvmc> to locate a library named I<LIBRARY> and search |
| it for unresolved symbols when linking the program. |
| |
| =item B<-L> F<path> |
| |
| This option instructs B<llvmc> to add F<path> to the list of places in which |
| the linker will search for libraries. |
| |
| =item B<-x> I<LANGUAGE> |
| |
| This option instructs B<llvmc> to regard the following input files as |
| containing programs in the language I<LANGUAGE>. Normally, input file languages |
| are identified by their suffix but this option will override that default |
| behavior. The B<-x> option stays in effect until the end of the options or |
| a new B<-x> option is encountered. |
| |
| =back |
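| |
| The scoping rule for B<-x> can be illustrated with a short Python sketch. It |
| is only an illustration under stated assumptions: the helper |
| C<assign_languages> and the suffix table are hypothetical (the C++ suffixes |
| mirror the master configuration example later in this document), and every |
| other option is simply skipped. |
| |
```python
import os

# Hypothetical suffix table; a real one would come from the master configuration.
SUFFIX_TO_LANGUAGE = {".cpp": "C++", ".cxx": "C++", ".C": "C++"}


def assign_languages(args):
    """Pair each input file with a language, honoring -x overrides."""
    forced = None  # language set by the most recent -x, if any
    result = []
    it = iter(args)
    for arg in it:
        if arg == "-x":
            forced = next(it, None)
            if forced is None:
                raise ValueError("-x requires a LANGUAGE argument")
        elif arg.startswith("-"):
            continue  # simplification: other options are ignored here
        else:
            suffix = os.path.splitext(arg)[1]
            # -x stays in effect until a new -x; otherwise use the suffix.
            result.append((arg, forced or SUFFIX_TO_LANGUAGE.get(suffix)))
    return result
```
| |
| With this sketch, C<a.cpp -x Stacker b.cpp> treats F<a.cpp> as C++ (by |
| suffix) and F<b.cpp> as Stacker (by override). |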
| |
| =head2 Output Options |
| |
| =over |
| |
| =item B<-m>I<arch> |
| |
| This option selects the back end code generator to use. The I<arch> portion |
| of the option names the back end to use. |
| |
| =item B<--native> |
| |
| Normally, B<llvmc> produces bytecode files at most stages of compilation. |
| With this option, B<llvmc> will arrange for native object files to be |
| generated with the B<-c> option, native assembly files to be generated |
| with the B<-S> option, and native executables to be generated with the |
| B<--link> option. In the case of the B<-E> option, the output will not |
| differ as there is no I<native> version of pre-processed output. |
| |
| =item B<-o> F<filename> |
| |
| Specify the output file name. The contents of the file depend on other |
| options. |
| |
| =back |
| |
| =head2 Information Options |
| |
| =over |
| |
| =item B<-n> or B<--no-op> |
| |
| This option tells B<llvmc> to do everything but actually execute the |
| resulting tools. In combination with the B<-v> option, this causes B<llvmc> |
| to merely print out what it would have done. |
| |
| =item B<-v> or B<--verbose> |
| |
| This option will cause B<llvmc> to print out (on standard output) each of the |
| actions it takes to accomplish the objective. The output will immediately |
| precede the invocation of other tools. |
| |
| =item B<--stats> |
| |
| Print all statistics gathered during the compilation to the standard error. |
| Note that this option is merely passed through to the sub-tools to do with |
| as they please. |
| |
| =item B<--time-passes> |
| |
| Record the amount of time needed for each optimization pass and print it |
| to standard error. Like B<--stats> this option is just passed through to |
| the sub-tools to do with as they please. |
| |
| =item B<--time-programs> |
| |
| Record the amount of time each program (compilation tool) takes and print |
| it to the standard error. |
| |
| =back |
| |
| =head2 Language Specific Options |
| |
| =over |
| |
| =item B<-T,pp>=I<options> |
| |
| Pass an arbitrary option to the pre-processor. |
| |
| =item B<-T,opt>=I<options> |
| |
| Pass an arbitrary option to the optimizer. |
| |
| =item B<-T,link>=I<options> |
| |
| Pass an arbitrary option to the linker. |
| |
| =item B<-T,asm>=I<options> |
| |
| Pass an arbitrary option to the code generator. |
| |
| =back |
| |
| =head3 C/C++ Specific Options |
| |
| =over |
| |
| =item B<-I>F<path> |
| |
| This option is just passed through to a C or C++ front end compiler to tell it |
| where include files can be found. |
| |
| =back |
| |
| =head2 Miscellaneous Options |
| |
| =over |
| |
| =item B<--help> |
| |
| Print a summary of command line options. |
| |
| =item B<-V> or B<--version> |
| |
| This option will cause B<llvmc> to print out its version number |
| and terminate. |
| |
| =back |
| |
| =head2 Advanced Options |
| |
| You had better know what you're doing if you use these options. Improper use |
| of these options can produce drastically wrong results. |
| |
| =over |
| |
| =item B<--show-config> I<[suffixes...]> |
| |
| When this option is given, the only action taken by B<llvmc> is to show its |
| final configuration state in the form of a configuration file. No compilation |
| tasks will be conducted when this option is given; processing will stop once |
| the configuration has been printed. The optional (comma separated) list of |
| suffixes controls what is printed. Without any suffixes, the configuration |
| for all languages is printed. With suffixes, only the languages pertaining |
| to those file suffixes will be printed. The configuration information is |
| printed after all command line options and configuration files have been |
| read and processed. This allows the user to verify that the correct |
| configuration data has been read by B<llvmc>. |
| |
| =item B<--config> :I<section>:I<name>=I<value> |
| |
| This option instructs B<llvmc> to accept I<value> as the value for configuration |
| item I<name> in the section named I<section>. This is a quick way to override |
| a configuration item on the command line without resorting to changing the |
| configuration files. |
| |
| =item B<--config-file> F<dirname> |
| |
| This option tells B<llvmc> to read configuration data from the I<directory> |
| named F<dirname>. Data from such directories will be read in the order |
| specified on the command line after all other standard configuration files have |
| been read. This allows users or groups of users to conveniently create |
| their own configuration directories in addition to the standard ones to which |
| they may not have write access. |
| |
| =item B<--config-only-from> F<dirname> |
| |
| This option tells B<llvmc> to skip the normal processing of configuration |
| files and only configure from the contents of the F<dirname> directory. Multiple |
| B<--config-only-from> options may be given in which case the directories are |
| read in the order given on the command line. |
| |
| |
| =item B<--emit-raw-code> |
| |
| No optimization is done whatsoever. The compilers invoked by B<llvmc> with |
| this option given will be instructed to produce raw, unoptimized code. This |
| option is useful only to front end language developers and therefore does not |
| participate in the list of B<-O> options. This is distinctly different from |
| the B<-O0> option (a synonym for B<-O1>) because those optimizations will |
| reduce code size to make compilation faster. With B<--emit-raw-code>, only |
| the full raw code produced by the compiler will be generated. |
| |
| =back |
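| |
| The argument form of B<--config> can be taken apart as shown in the following |
| Python sketch. The helper C<parse_config_override> is hypothetical, and how |
| B<llvmc> itself treats malformed arguments is not specified here, so the |
| error handling is an assumption. |
| |
```python
def parse_config_override(spec):
    """Split ':section:name=value' into (section, name, value)."""
    if not spec.startswith(":"):
        raise ValueError("expected ':section:name=value', got %r" % spec)
    # The section runs up to the next colon; it may itself contain '='.
    section, sep, rest = spec[1:].partition(":")
    if not sep or not section:
        raise ValueError("missing section in %r" % spec)
    # The name runs up to the first '='; the value is everything after it.
    name, sep, value = rest.partition("=")
    if not sep or not name:
        raise ValueError("missing name=value in %r" % spec)
    return section, name, value
```
| |
| For example, C<:general:output=ll> yields the section C<general>, the item |
| C<output>, and the value C<ll>. |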
| |
| =head1 CONFIGURATION |
| |
| =head2 Warning |
| |
| Configuration information is relatively static for a given release of LLVM and |
| a front end compiler. However, the details may change from release to release. |
| Users are encouraged to simply use the various options of the B<llvmc> command |
| and ignore the configuration of the tool. These configuration files are for |
| compiler writers and LLVM developers. Those wishing to simply use B<llvmc> |
| don't need to understand this section but it may be instructive on what the tool |
| does. |
| |
| =head2 Introduction |
| |
| B<llvmc> is highly configurable both on the command line and in configuration |
| files. The options it understands are generic, consistent and simple by design. |
| Furthermore, the B<llvmc> options apply to the compilation of any LLVM enabled |
| programming language. To be enabled as a supported source language compiler, a |
| compiler writer must provide a configuration file that tells B<llvmc> how to |
| invoke the compiler and what its capabilities are. The purpose of the |
| configuration files then is to allow compiler writers to specify to B<llvmc> how |
| the compiler should be invoked. Users may, but are not advised to, alter the |
| compiler's B<llvmc> configuration. |
| |
| Because B<llvmc> just invokes other programs, it must deal with the |
| available command line options for those programs regardless of whether they |
| were written for LLVM or not. Furthermore, not all compilation front ends will |
| have the same capabilities. Some front ends will simply generate LLVM assembly |
| code; others will be able to generate fully optimized bytecode. In general, |
| B<llvmc> doesn't make any assumptions about the capabilities or command line |
| options of a sub-tool. It simply uses the details found in the configuration |
| files and leaves it to the compiler writer to specify the configuration |
| correctly. |
| |
| This approach means that new compiler front ends can be up and working very |
| quickly. As a first cut, a front end can simply compile its source to raw |
| (unoptimized) bytecode or LLVM assembly and B<llvmc> can be configured to pick |
| up the slack (translate LLVM assembly to bytecode, optimize the bytecode, |
| generate native assembly, link, etc.). In fact, the front end need not use |
| any LLVM libraries, and it could be written in any language (instead of C++). |
| The configuration data will allow the full range of optimization, assembly, |
| and linking capabilities that LLVM provides to be added to these kinds of tools. |
| Enabling the rapid development of front-ends is one of the primary goals of |
| B<llvmc>. |
| |
| As a compiler front end matures, it may utilize the LLVM libraries and tools to |
| more efficiently produce optimized bytecode directly in a single compilation and |
| optimization program. In these cases, multiple tools would not be needed and |
| the configuration data for the compiler would change. |
| |
| Configuring B<llvmc> to the needs and capabilities of a source language compiler |
| is relatively straightforward. The compilation process is broken down into five |
| phases: |
| |
| =over |
| |
| =item * Pre-processing (filter and combine source files) |
| |
| =item * Translation (translate source language to LLVM assembly or bytecode) |
| |
| =item * Optimization (make bytecode execute quickly) |
| |
| =item * Assembly (converting bytecode to object code) |
| |
| =item * Linking (converting translated code to an executable) |
| |
| =back |
| |
| A compiler writer must provide a definition of what to do for each of these five |
| phases for each of the optimization levels. The specification consists simply of |
| prototypical command lines into which B<llvmc> can substitute command line |
| arguments and file names. Note that any given phase can be completely blank if |
| the source language's compiler combines multiple phases into a single program. |
| For example, quite often pre-processing, translation, and optimization are |
| combined into a single program. The specification for such a compiler would have |
| blank entries for pre-processing and translation but a full command line for |
| optimization. |
| |
| =head2 Configuration File Types |
| |
| There are two types of configuration files: the master configuration file |
| and the language specific configuration file. |
| |
| The master configuration file contains the general configuration of B<llvmc> |
| itself. This includes things like the mapping between file extensions and |
| source languages. This mapping is needed in order to quickly read only the |
| applicable language-specific configuration files (avoiding reading every |
| configuration file for every compilation task). |
| |
| Language specific configuration files tell B<llvmc> how to invoke the language's |
| compiler for a variety of different tasks and what other tools are needed to |
| I<backfill> the compiler's missing features (e.g. optimization). |
| |
| Language specific configuration files are placed in directories and given |
| specific names to foster faster lookup. The name of a given configuration file |
| is the name of the source language. |
| |
| =head2 Default Directory Locations |
| |
| B<llvmc> will look for configuration files in two standard locations: the |
| LLVM installation directory (typically C</usr/local/llvm/etc>) and the user's |
| home directory (typically C</home/user/.llvm>). In these directories a file |
| named C<master> provides the master configuration for B<llvmc>. Language |
| specific files will have a language specific name (e.g. C++, Stacker, Scheme, |
| FORTRAN). When reading the configuration files, the master files are always |
| read first in the following order: |
| |
| =over |
| |
| =item 1 C<master> in the LLVM installation directory. |
| |
| =item 2 C<master> in the user's home directory. |
| |
| =back |
| |
| Then, based on the command line options and the suffixes of the file names |
| provided on B<llvmc>'s command line, one or more language specific configuration |
| files are read. Only the language specific configuration files actually needed |
| to complete B<llvmc>'s task are read. Other language specific files will be |
| ignored. |
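| |
| The read order described above can be sketched in Python as follows. The |
| helper C<config_read_order> is hypothetical. That language specific files are |
| looked up in the same two directories, in the same order as the master files, |
| is an assumption; this document only fixes the order of the master files. |
| |
```python
import posixpath


def config_read_order(install_dir, home_dir, languages):
    """List configuration files in the order they would be read."""
    dirs = [install_dir, home_dir]
    # Master files are always read first: installation directory, then home.
    paths = [posixpath.join(d, "master") for d in dirs]
    # Only the languages actually needed for the task are read afterwards.
    for language in languages:
        # Assumption: same directory order as for the master files.
        paths.extend(posixpath.join(d, language) for d in dirs)
    return paths
```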
| |
| Note that the user can affect this process in several ways using the various |
| B<--config-*> options and with the B<-x> I<LANGUAGE> option. |
| |
| Although a user I<can> override the master configuration file, this is not |
| advised. The capability is retained so that compiler writers can affect the |
| master configuration (such as adding new file suffixes) while developing a new |
| compiler front end since they might not have write access to the installed |
| master configuration. |
| |
| =head2 Syntax |
| |
| The syntax of the configuration files is yet to be determined. There are three |
| viable options: |
| |
| =over |
| |
| =item XML |
| |
| =item Windows .ini |
| |
| =item specific to B<llvmc> |
| |
| =back |
| |
| =head2 Master Configuration Items |
| |
| =head3 Section: [lang=I<LANGUAGE>] |
| |
| This section provides the master configuration data for a given language. The |
| language specific data will be found in a file named I<LANGUAGE>. |
| |
| =over |
| |
| =item C<suffix=>I<suffix> |
| |
| This adds the I<suffix> specified to the list of recognized suffixes for |
| the I<LANGUAGE> identified in the section. As many suffixes as are commonly used |
| for source files for the I<LANGUAGE> should be specified. |
| |
| =back |
| |
| =begin html |
| |
| <p>For example, the following might appear for C++: |
| <pre><tt> |
| [lang=C++] |
| suffix=.cpp |
| suffix=.cxx |
| suffix=.C |
| </tt></pre></p> |
| |
| =end html |
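| |
| Since the configuration syntax is not yet settled, the following Python |
| sketch only shows how a file written in the form of the C++ example could be |
| turned into a suffix-to-language table. The helper C<parse_master> is |
| hypothetical and handles nothing beyond C<[lang=...]> headers and C<suffix=> |
| items. |
| |
```python
def parse_master(text):
    """Map each recognized suffix to the language of its [lang=...] section."""
    suffix_to_language = {}
    language = None  # language of the section currently being read
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith("[lang=") and line.endswith("]"):
            language = line[len("[lang="):-1]
        elif line.startswith("suffix=") and language is not None:
            # A section may list several suffixes, one per item.
            suffix_to_language[line[len("suffix="):]] = language
    return suffix_to_language


example = """
[lang=C++]
suffix=.cpp
suffix=.cxx
suffix=.C
"""
table = parse_master(example)
```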
| |
| =head2 Language Specific Configuration Items |
| |
| =head3 Section: [general] |
| |
| =over |
| |
| =item C<hasPreProcessor=yes|no> |
| |
| This item specifies whether the language has a pre-processing phase or not. This |
| controls whether the B<-E> option works for the language or not. |
| |
| =item C<output=bc|ll> |
| |
| This item specifies the kind of output the language's compiler generates. The |
| choices are either bytecode (C<bc>) or LLVM assembly (C<ll>). |
| |
| =back |
| |
| =head3 Section: [-O0] |
| |
| =over |
| |
| =item C<preprocess=>I<commandline> |
| |
| This item specifies the I<commandline> to use for pre-processing the input. |
| |
| Valid substitutions for this item are: |
| |
| =over |
| |
| =item %in% |
| |
| The input source file. |
| |
| =item %out% |
| |
| The output file. |
| |
| =item %options% |
| |
| Any pre-processing specific options (e.g. B<-I>). |
| |
| =back |
| |
| =item C<translate=>I<commandline> |
| |
| This item specifies the I<commandline> to use for translating the source |
| language input into the output format given by the C<output> item. |
| |
| =item C<optimize=>I<commandline> |
| |
| This item specifies the I<commandline> for optimizing the translator's output. |
| |
| =back |
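| |
| The substitution of C<%in%>, C<%out%> and C<%options%> into a prototypical |
| command line can be sketched in Python as follows. The helper C<expand> and |
| the tool name C<cpp> in the usage line are hypothetical, and splitting the |
| command line with shell-like quoting rules is an assumption. |
| |
```python
import re
import shlex


def expand(template, infile, outfile, options=()):
    """Expand %in%, %out% and %options% in a prototypical command line."""
    values = {"in": infile, "out": outfile}
    argv = []
    for word in shlex.split(template):
        if word == "%options%":
            # Each pass-through option becomes its own argument.
            argv.extend(options)
        else:
            # Single pass, so substituted text is never expanded again.
            argv.append(re.sub(r"%(in|out)%", lambda m: values[m.group(1)], word))
    return argv


# Hypothetical preprocess item: preprocess=cpp %options% %in% -o %out%
command = expand("cpp %options% %in% -o %out%", "hello.c", "hello.i",
                 ["-I/usr/include"])
```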
| |
| =head1 EXIT STATUS |
| |
| If B<llvmc> succeeds, it will exit with 0. Otherwise, if an error |
| occurs, it will exit with a non-zero value and no compilation actions |
| will be taken. If one of the compilation tools returns a non-zero |
| status, pending actions will be discarded and B<llvmc> will return the |
| same result code as the failing compilation tool. |
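| |
| The rule for tool failures can be sketched in Python as follows. The helper |
| C<run_actions> is hypothetical; it only illustrates running the actions in |
| order, discarding the pending ones after the first failure, and returning the |
| failing tool's status. |
| |
```python
import subprocess
import sys


def run_actions(actions):
    """Run each command in order; return the first non-zero status, else 0."""
    for argv in actions:
        status = subprocess.run(argv).returncode
        if status != 0:
            return status  # pending actions are discarded
    return 0


# Stand-ins for compilation tools: one succeeds, one exits with status 3.
ok = [sys.executable, "-c", "pass"]
fail = [sys.executable, "-c", "import sys; sys.exit(3)"]
result = run_actions([ok, fail, ok])  # the third action never runs
```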
| |
| =head1 SEE ALSO |
| |
| L<gccas|gccas>, L<gccld|gccld>, L<llvm-as|llvm-as>, L<llvm-dis|llvm-dis>, |
| L<llc|llc>, L<llvm-link|llvm-link> |
| |
| =head1 AUTHORS |
| |
| Reid Spencer, L<rspencer@x10sys.com> |
| |
| =cut |