Add first proof-of-concept universal compiler driver framework based
on ideas mentioned in PR686.
Written by Mikhail Glushenkov and contributed by Codedgers, Inc.

The old llvmc will be removed once the new one has all of its features.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@48699 91177308-0d34-0410-b5e6-96231b3b80d8
diff --git a/tools/llvmc2/doc/LLVMC-Enhancements.rst b/tools/llvmc2/doc/LLVMC-Enhancements.rst
new file mode 100644
index 0000000..a831ea0
--- /dev/null
+++ b/tools/llvmc2/doc/LLVMC-Enhancements.rst
@@ -0,0 +1,270 @@
+Introduction
+============
+
+A complete rewrite of the LLVMC compiler driver is proposed, aimed at
+making it more configurable and useful.
+
+Motivation
+==========
+
+As it stands, the current version of LLVMC does not meet its stated goals
+of configurability and extensibility and is therefore not used
+much. The need for enhancements in LLVMC is also reflected in [1]_. The
+proposed rewrite will fix the aforementioned deficiencies and provide
+an extensible, future-proof solution.
+
+Design
+======
+
+A compiler driver's job is essentially to find a way to transform
+a set of input files into a set of targets, depending on the
+user-provided options. Since several transformation paths may
+exist, it's natural to use a directed graph to represent all of
+them. In this graph, nodes are tools (for example, ``gcc -S`` is a tool
+that generates assembly from C language files) and edges between them
+mean that the output of one tool can be given as input to another (as
+in ``gcc -S -o - file.c | as``). We'll call this graph the compilation
+graph.
+
+The proposed design revolves around the compilation graph and the
+following core abstractions:
+
+- Target - An (intermediate) compilation target.
+
+- Action - A shell command template that represents a basic compilation
+  transformation (example: ``gcc -S $INPUT_FILE -o $OUTPUT_FILE``).
+
+- Tool - Encapsulates information about a concrete tool used in the
+  compilation process and produces Actions. Its operation depends on
+  command-line options provided by the user.
+
+- GraphBuilder - Constructs the compilation graph. Its operation
+  depends on command-line options.
+
+- GraphTraverser - Traverses the compilation graph and constructs a
+  sequence of Actions needed to build the target file. Its operation
+  depends on command-line options.
+
+A high-level view of the compilation process:
+
+  1. Configuration libraries (see below) are loaded in and the
+     compilation graph is constructed from the tool descriptions.
+
+  2. Information about possible options is gathered from (the nodes of)
+     the compilation graph.
+
+  3. Options are parsed based on data gathered in step 2.
+
+  4. A sequence of Actions needed to build the target is constructed
+     using the compilation graph and provided options.
+
+  5. The resulting action sequence is executed.
+
+Extensibility
+==============
+
+To make this design extensible, TableGen [2]_ will be used for
+automatic generation of the Tool classes. Users wanting to customize
+LLVMC will need to write a configuration library consisting of a set
+of TableGen descriptions of compilation tools plus a number of hooks
+that influence compilation graph construction and traversal. LLVMC
+will have the ability to load user configuration libraries at runtime;
+in fact, its own basic functionality will be implemented as a
+configuration library.
+
+TableGen specification example
+------------------------------
+
+This small example specifies a Tool that converts C source to object
+files. Note that it is only a mock-up of intended functionality, not a
+final specification::
+
+    def GCC : Tool<
+     GCCProperties, // Properties of this tool
+     GCCOptions     // Options description for this tool
+    >;
+
+    def GCCProperties : ToolProperties<[
+     ToolName<"GCC">,
+     InputLanguageName<"C">,
+     OutputLanguageName<"Object-Code">,
+     InputFileExtension<"c">,
+     OutputFileExtension<"o">,
+     CommandFormat<"gcc -c $OPTIONS $FILES">
+    ]>;
+
+    def GCCOptions : ToolOptions<[
+     Option<
+       "-Wall",                 // Option name
+       [None],                  // Allowed values
+       [AddOption<"-Wall">]>,   // Action
+
+     Option<
+       "-Wextra",               // Option name
+       [None],                  // Allowed values
+       [AddOption<"-Wextra">]>, // Action
+
+     Option<
+       "-W",                 // Option name
+       [None],               // Allowed values
+       [AddOption<"-W">]>,   // Action
+
+     Option<
+       "-D",        // Option name
+       [AnyString], // Allowed values
+
+       [AddOptionWithArgument<"-D",GetOptionArgument<"-D">>]
+       // Action:
+       // If the driver was given option "-D<argument>", add
+       // option "-D" with the same argument to the invocation string of
+       // this tool.
+       >
+
+     ]>;
+
+Example of generated code
+-------------------------
+
+The specification above compiles to the following code (again, it's a
+mock-up)::
+
+    class GCC : public Tool {
+
+    public:
+
+      GCC() { /* ... */ }
+
+     // Properties
+
+      static const char* ToolName = "GCC";
+      static const char* InputLanguageName = "C";
+      static const char* OutputLanguageName = "Object-Code";
+      static const char* InputFileExtension = "c";
+      static const char* OutputFileExtension = "o";
+      static const char* CommandFormat = "gcc -c $OPTIONS $FILES";
+
+     // Options
+
+     OptionsDescription SupportedOptions() {
+       OptionsDescription supportedOptions;
+
+       supportedOptions.Add(Option("-Wall"));
+       supportedOptions.Add(Option("-Wextra"));
+       supportedOptions.Add(Option("-W"));
+       supportedOptions.Add(Option("-D", AllowedArgs::ANY_STRING));
+
+       return supportedOptions;
+     }
+
+     Action GenerateAction(Options providedOptions) {
+       Action generatedAction(CommandFormat); Option curOpt;
+
+       curOpt = providedOptions.Get("-D");
+       if (curOpt) {
+          assert(curOpt.HasArgument());
+          generatedAction.AddOption(Option("-D", curOpt.GetArgument()));
+       }
+
+       curOpt = providedOptions.Get("-Wall");
+       if (curOpt)
+         generatedAction.AddOption(Option("-Wall"));
+
+       curOpt = providedOptions.Get("-Wextra");
+       if (curOpt)
+         generatedAction.AddOption(Option("-Wextra"));
+
+       curOpt = providedOptions.Get("-W");
+       if (curOpt)
+         generatedAction.AddOption(Option("-W"));
+
+       return generatedAction;
+     }
+
+    };
+
+    // defined somewhere...
+
+    class Action {
+    public:
+      void AddOption(const Option& opt) {...}
+      int Run(const Filenames& fnms) {...}
+    };
+
+Option handling
+===============
+
+Since one of the main tasks of the compiler driver is to correctly
+handle user-provided options, it is important to define this process
+precisely. The intent of the proposed scheme is to function as a
+drop-in replacement for GCC.
+
+Option syntax
+-------------
+
+Option syntax is specified by the following formal grammar::
+
+        <command-line>      ::=  <option>*
+        <option>            ::=  <positional-option> | <named-option>
+        <named-option>      ::=  -[-]<option-name>[<delimiter><option-argument>]
+        <delimiter>         ::=  ',' | '=' | ' '
+        <positional-option> ::=  <string>
+        <option-name>       ::=  <string>
+        <option-argument>   ::=  <string>
+
+This roughly corresponds to the GCC option syntax. Note that grouping
+of short options (as in ``ls -la``) is forbidden.
+
+Example::
+
+        llvmc -O3 -Wa,-foo,-bar -pedantic -std=c++0x a.c b.c c.c
+
+Option arguments can also have special forms: for example, an argument
+can be a comma-separated list (as in ``-Wa,-foo,-bar``). In such cases,
+it's up to the option handler to parse the argument.
+
+Option semantics
+----------------
+
+According to their meaning, options are classified into the following
+categories:
+
+- Global options - Options that influence compilation graph
+  construction/traversal. Example: -E (stop after preprocessing).
+
+- Local options - Options that influence one or several Actions in
+  the generated action sequence. Example: -O3 (turn on optimization).
+
+- Prefix options - Options that influence the meaning of the following
+  command-line arguments. Example: -x language (specify the language of
+  the input files explicitly). Prefix options can be local or global.
+
+- Built-in options - Options that are hard-coded into the
+  driver. Examples: --help, -o file/-pipe (redirect output). Can be
+  local or global.
+
+Naming
+======
+
+Since the compiler driver, as a single point of access to the LLVM
+tool set, is used very often, it would be desirable to make its name
+as short and easy to type as possible. Some possible names are 'llcc' or
+'lcc', by analogy with gcc.
+
+
+Issues
+======
+
+1. Should global-options-influencing hooks be written by hand or
+   auto-generated from TableGen specifications?
+
+2. More?
+
+References
+==========
+
+.. [1] LLVM Bug#686
+
+       http://llvm.org/bugs/show_bug.cgi?id=686
+
+.. [2] TableGen Fundamentals
+
+       http://llvm.org/docs/TableGenFundamentals.html
diff --git a/tools/llvmc2/doc/LLVMCC-Tutorial.rst b/tools/llvmc2/doc/LLVMCC-Tutorial.rst
new file mode 100644
index 0000000..8374cad
--- /dev/null
+++ b/tools/llvmc2/doc/LLVMCC-Tutorial.rst
@@ -0,0 +1,153 @@
+Tutorial - Writing LLVMCC Configuration files
+=============================================
+
+LLVMCC is a generic compiler driver (just like ``gcc``), designed to be
+customizable and extensible. Its job is essentially to transform a set
+of input files into a set of targets, depending on configuration rules
+and user options. This tutorial describes how one can write
+configuration files for ``llvmcc``.
+
+Since LLVMCC uses TableGen [1]_ as the language of its configuration
+files, you need to be familiar with it.
+
+Describing a toolchain
+----------------------
+
+The main concept that ``llvmcc`` operates on is a *toolchain*, which
+is just a list of tools that process input files in a pipeline-like
+fashion. Toolchain definitions look like this::
+
+   def ToolChains : ToolChains<[
+       ToolChain<[llvm_gcc_c, llc, llvm_gcc_assembler, llvm_gcc_linker]>,
+       ToolChain<[llvm_gcc_cpp, llc, llvm_gcc_assembler, llvm_gcc_linker]>,
+       ...
+       ]>;
+
+Every configuration file should have a single toolchains list called
+``ToolChains``.
+
+At the time of writing, ``llvmcc`` does not support mixing various
+toolchains together - in other words, all input files should be in the
+same language.
+
+Another temporary limitation is that every toolchain should end with a
+"join" node - a linker-like program that combines its inputs into a
+single output file.
+
+Describing a tool
+-----------------
+
+A single element of a toolchain is a tool. A tool definition looks
+like this (taken from the Tools.td file)::
+
+  def llvm_gcc_cpp : Tool<[
+      (in_language "c++"),
+      (out_language "llvm-assembler"),
+      (output_suffix "bc"),
+      (cmd_line "llvm-g++ -c $INFILE -o $OUTFILE -emit-llvm"),
+      (sink)
+      ]>;
+
+This defines a new tool called ``llvm_gcc_cpp``, which is an alias for
+``llvm-g++``. As you can see, a tool definition is just a list of
+properties; most of them should be self-evident. The ``sink`` property
+means that this tool should be passed all command-line options that
+aren't handled by the other tools.
+
+The complete list of the currently implemented tool properties follows:
+
+* Possible tool properties:
+
+  - in_language - input language name.
+
+  - out_language - output language name.
+
+  - output_suffix - output file suffix.
+
+  - cmd_line - the actual command used to run the tool. You can use
+    ``$INFILE`` and ``$OUTFILE`` variables.
+
+  - join - this tool is a "join node" in the graph, i.e. it gets a
+    list of input files and joins them together. Used for linkers.
+
+  - sink - all command-line options that are not handled by other
+    tools are passed to this tool.
+
+The next tool definition is slightly more complex::
+
+  def llvm_gcc_linker : Tool<[
+      (in_language "object-code"),
+      (out_language "executable"),
+      (output_suffix "out"),
+      (cmd_line "llvm-gcc $INFILE -o $OUTFILE"),
+      (join),
+      (prefix_list_option "L", (forward), (help "add a directory to link path")),
+      (prefix_list_option "l", (forward), (help "search a library when linking")),
+      (prefix_list_option "Wl", (unpack_values), (help "pass options to linker"))
+      ]>;
+
+This tool has a "join" property, which means that it behaves like a
+linker (and should therefore be the last tool in the
+toolchain). This tool also defines several command-line options: ``-l``,
+``-L`` and ``-Wl`` which have their usual meaning. An option has two
+attributes: a name and a (possibly empty) list of properties. All
+currently implemented option types and properties are described below:
+
+* Possible option types:
+
+   - switch_option - a simple boolean switch, for example ``-time``.
+
+   - parameter_option - an option that takes an argument, for example ``-std=c99``.
+
+   - parameter_list_option - same as the above, but more than one
+     occurrence of the option is allowed.
+
+   - prefix_option - same as the parameter_option, but the option name
+     and parameter value are not separated.
+
+   - prefix_list_option - same as the above, but more than one
+     occurrence of the option is allowed; example: ``-lm -lpthread``.
+
+* Possible option properties:
+
+   - append_cmd - append a string to the tool invocation command.
+
+   - forward - forward this option unchanged.
+
+   - stop_compilation - stop compilation after this phase.
+
+   - unpack_values - used for splitting and forwarding
+     comma-separated lists of options, e.g. ``-Wa,-foo=bar,-baz`` is
+     converted to ``-foo=bar -baz`` and appended to the tool invocation
+     command.
+
+   - help - help string associated with this option.
+
+   - required - this option is obligatory.
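+
+As an illustration, the option types and properties listed above could
+be combined as in the following sketch. The tool name ``my_compiler``,
+the ``my-cc`` command and the chosen options are hypothetical, and the
+option syntax is assumed to follow the same pattern as the
+``prefix_list_option`` entries of ``llvm_gcc_linker``::
+
+  // Hypothetical tool, not part of Tools.td.
+  def my_compiler : Tool<[
+      (in_language "c"),
+      (out_language "llvm-assembler"),
+      (output_suffix "bc"),
+      (cmd_line "my-cc -c $INFILE -o $OUTFILE"),
+      // Boolean switch that stops compilation after this phase.
+      (switch_option "E", (stop_compilation), (help "stop after this phase")),
+      // Option with a single argument, forwarded unchanged.
+      (parameter_option "std", (forward), (help "select the language standard")),
+      // Repeatable prefix option, e.g. -Ifoo -Ibar.
+      (prefix_list_option "I", (forward), (help "add a directory to include path"))
+      ]>;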
+
+Language map
+------------
+
+One last bit that you will probably want to change is the language map,
+which defines mappings between language names and file extensions. It
+is used internally to choose the proper toolchain based on the names
+of the input files. The language map definition is located in the file
+``Tools.td`` and looks like this::
+
+    def LanguageMap : LanguageMap<
+        [LangToSuffixes<"c++", ["cc", "cp", "cxx", "cpp", "CPP", "c++", "C"]>,
+         LangToSuffixes<"c", ["c"]>,
+         ...
+        ]>;
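+
+A new language could presumably be registered by adding one more entry
+to this list. In the sketch below the ``objective-c`` name and its
+``m`` suffix are hypothetical and only show the shape of an entry; the
+language name is assumed to match the ``in_language`` of the first
+tool in some toolchain::
+
+    def LanguageMap : LanguageMap<
+        [LangToSuffixes<"c++", ["cc", "cp", "cxx", "cpp", "CPP", "c++", "C"]>,
+         LangToSuffixes<"c", ["c"]>,
+         // Hypothetical entry: files ending in .m belong to "objective-c".
+         LangToSuffixes<"objective-c", ["m"]>
+        ]>;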
+
+
+Putting it all together
+-----------------------
+
+Since at the time of writing LLVMCC does not support on-the-fly
+reloading of the configuration, the only way to test your changes is
+to recompile the program. To do this, ``cd`` to the source code
+directory and run ``make``.
+
+References
+==========
+
+.. [1] TableGen Fundamentals
+       http://llvm.cs.uiuc.edu/docs/TableGenFundamentals.html