|  | =========================== | 
|  | TableGen Language Reference | 
|  | =========================== | 
|  |  | 
|  | .. contents:: | 
|  | :local: | 
|  |  | 
|  | .. warning:: | 
|  | This document is extremely rough. If you find something lacking, please | 
|  | fix it, file a documentation bug, or ask about it on llvm-dev. | 
|  |  | 
|  | Introduction | 
|  | ============ | 
|  |  | 
|  | This document is meant to be a normative spec about the TableGen language | 
|  | in and of itself (i.e. how to understand a given construct in terms of how | 
|  | it affects the final set of records represented by the TableGen file). If | 
|  | you are unsure if this document is really what you are looking for, please | 
|  | read the :doc:`introduction to TableGen <index>` first. | 
|  |  | 
|  | Notation | 
|  | ======== | 
|  |  | 
|  | The lexical and syntax notation used here is intended to imitate | 
|  | `Python's`_. In particular, for lexical definitions, the productions | 
|  | operate at the character level and there is no implied whitespace between | 
|  | elements. The syntax definitions operate at the token level, so there is | 
|  | implied whitespace between tokens. | 
|  |  | 
|  | .. _`Python's`: http://docs.python.org/py3k/reference/introduction.html#notation | 
|  |  | 
|  | Lexical Analysis | 
|  | ================ | 
|  |  | 
|  | TableGen supports BCPL (``// ...``) and nestable C-style (``/* ... */``) | 
|  | comments. | 
|  |  | 
|  | The following is a listing of the basic punctuation tokens:: | 
|  |  | 
|  | - + [ ] { } ( ) < > : ; . = ? # | 
|  |  | 
|  | Numeric literals take one of the following forms: | 
|  |  | 
|  | .. TableGen actually will lex some pretty strange sequences and interpret | 
|  | them as numbers. What is shown here is an attempt to approximate what it | 
|  | "should" accept. | 
|  |  | 
|  | .. productionlist:: | 
|  | TokInteger: `DecimalInteger` | `HexInteger` | `BinInteger` | 
|  | DecimalInteger: ["+" | "-"] ("0"..."9")+ | 
|  | HexInteger: "0x" ("0"..."9" | "a"..."f" | "A"..."F")+ | 
|  | BinInteger: "0b" ("0" | "1")+ | 
|  |  | 
|  | One aspect to note is that the :token:`DecimalInteger` token *includes* the | 
|  | ``+`` or ``-``, as opposed to having ``+`` and ``-`` be unary operators as | 
|  | most languages do. | 
|  |  | 
|  | Also note that :token:`BinInteger` creates a value of type ``bits<n>`` | 
|  | (where ``n`` is the number of bits).  This will implicitly convert to | 
|  | integers when needed. | 
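|  |  | 
|  | As a small illustration of the literal forms above, the following sketch | 
|  | initializes fields from each kind of literal. The record and field names are | 
|  | made up for this example:: | 
|  |  | 
|  |   // Hypothetical record; only the literal forms matter here. | 
|  |   def LiteralDemo { | 
|  |     int Dec = 42;          // DecimalInteger | 
|  |     int Neg = -7;          // the sign is part of the token | 
|  |     int Hex = 0x2A;        // HexInteger | 
|  |     bits<4> Bin = 0b1010;  // BinInteger, a bits<4> value | 
|  |   } | 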
|  |  | 
|  | TableGen has identifier-like tokens: | 
|  |  | 
|  | .. productionlist:: | 
|  | ualpha: "a"..."z" | "A"..."Z" | "_" | 
|  | TokIdentifier: ("0"..."9")* `ualpha` (`ualpha` | "0"..."9")* | 
|  | TokVarName: "$" `ualpha` (`ualpha` |  "0"..."9")* | 
|  |  | 
|  | Note that unlike most languages, TableGen allows :token:`TokIdentifier` to | 
|  | begin with a number. In case of ambiguity, a token will be interpreted as a | 
|  | numeric literal rather than an identifier. | 
|  |  | 
|  | TableGen also has two string-like literals: | 
|  |  | 
|  | .. productionlist:: | 
|  | TokString: '"' <non-'"' characters and C-like escapes> '"' | 
|  | TokCodeFragment: "[{" <shortest text not containing "}]"> "}]" | 
|  |  | 
|  | :token:`TokCodeFragment` is essentially a multiline string literal | 
|  | delimited by ``[{`` and ``}]``. | 
|  |  | 
|  | .. note:: | 
|  | The current implementation accepts the following C-like escapes:: | 
|  |  | 
|  | \\ \' \" \t \n | 
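|  |  | 
|  | A minimal sketch of both string-like literals, using hypothetical record and | 
|  | field names:: | 
|  |  | 
|  |   // Hypothetical record showing TokString and TokCodeFragment. | 
|  |   def StringDemo { | 
|  |     string Greeting = "hello\tworld\n";  // escapes from the note above | 
|  |     string Quoted = "say \"hi\""; | 
|  |     code Body = [{ | 
|  |       return X + 1; | 
|  |     }];                                  // text between [{ and }] is kept as-is | 
|  |   } | 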
|  |  | 
|  | TableGen also has the following keywords:: | 
|  |  | 
|  | bit   bits      class   code         dag | 
|  | def   foreach   defm    field        in | 
|  | int   let       list    multiclass   string | 
|  |  | 
|  | TableGen also has "bang operators" which have a | 
|  | wide variety of meanings: | 
|  |  | 
|  | .. productionlist:: | 
|  | BangOperator: one of | 
|  | :!eq     !if      !head    !tail      !con | 
|  | :!add    !shl     !sra     !srl       !and | 
|  | :!or     !empty   !subst   !foreach   !strconcat | 
|  | :!cast   !listconcat       !size      !foldl | 
|  | :!isa    !dag     !le      !lt        !ge | 
|  | :!gt     !ne | 
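|  |  | 
|  | To give a feel for how a few of these operators read in practice, here is a | 
|  | short sketch. The class, record, and field names are invented for | 
|  | illustration, and only a handful of the operators are shown:: | 
|  |  | 
|  |   // Hypothetical class; N is a template argument. | 
|  |   class Scaled<int N> { | 
|  |     int Doubled = !add(N, N); | 
|  |     int Shifted = !shl(N, 2); | 
|  |     string Kind = !if(!eq(N, 0), "zero", "nonzero"); | 
|  |     string Label = !strconcat("scaled", "_value"); | 
|  |   } | 
|  |  | 
|  |   def S3 : Scaled<3>; | 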
|  |  | 
|  |  | 
|  | Syntax | 
|  | ====== | 
|  |  | 
|  | TableGen has an ``include`` mechanism. It does not play a role in the | 
|  | syntax per se, since it is lexically replaced with the contents of the | 
|  | included file. | 
|  |  | 
|  | .. productionlist:: | 
|  | IncludeDirective: "include" `TokString` | 
|  |  | 
|  | TableGen's top-level production consists of "objects". | 
|  |  | 
|  | .. productionlist:: | 
|  | TableGenFile: `Object`* | 
|  | Object: `Class` | `Def` | `Defm` | `Defset` | `Let` | `MultiClass` | | 
|  | `Foreach` | 
|  |  | 
|  | ``class``\es | 
|  | ------------ | 
|  |  | 
|  | .. productionlist:: | 
|  | Class: "class" `TokIdentifier` [`TemplateArgList`] `ObjectBody` | 
|  | TemplateArgList: "<" `Declaration` ("," `Declaration`)* ">" | 
|  |  | 
|  | A ``class`` declaration creates a record which other records can inherit | 
|  | from. A class can be parametrized by a list of "template arguments", whose | 
|  | values can be used in the class body. | 
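|  |  | 
|  | For example, a parametrized class and two records inheriting from it might | 
|  | look like the following sketch (all names here are hypothetical):: | 
|  |  | 
|  |   // Hypothetical class with two template arguments, one with a default. | 
|  |   class Reg<string n, int Enc = 0> { | 
|  |     string AsmName = n; | 
|  |     int Encoding = Enc; | 
|  |   } | 
|  |  | 
|  |   def R0 : Reg<"r0">;     // Enc takes its default value | 
|  |   def R1 : Reg<"r1", 1>; | 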
|  |  | 
|  | A given class can only be defined once. A ``class`` declaration is | 
|  | considered to define the class if any of the following is true: | 
|  |  | 
|  | .. break ObjectBody into its constituents so that they are present here? | 
|  |  | 
|  | #. The :token:`TemplateArgList` is present. | 
|  | #. The :token:`Body` in the :token:`ObjectBody` is present and is not empty. | 
|  | #. The :token:`BaseClassList` in the :token:`ObjectBody` is present. | 
|  |  | 
|  | You can declare an empty class by giving an empty :token:`TemplateArgList` | 
|  | and an empty :token:`ObjectBody`. This can serve as a restricted form of | 
|  | forward declaration: note that records deriving from the forward-declared | 
|  | class will inherit no fields from it since the record expansion is done | 
|  | when the record is parsed. | 
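|  |  | 
|  | The forward-declaration pattern described above could be sketched as follows. | 
|  | The class names are hypothetical, and the sketch assumes the declaration is | 
|  | only needed so that the class can be named as a type before it is defined:: | 
|  |  | 
|  |   class Node;      // declaration only: no template args, bases, or body | 
|  |  | 
|  |   class Edge { | 
|  |     Node From;     // Node can be named as a type once it is declared | 
|  |   } | 
|  |  | 
|  |   class Node {     // the single definition of Node | 
|  |     int Weight = 1; | 
|  |   } | 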
|  |  | 
|  | Every class has an implicit template argument called ``NAME``, which is set | 
|  | to the name of the instantiating ``def`` or ``defm``. The result is undefined | 
|  | if the class is instantiated by an anonymous record. | 
|  |  | 
|  | Declarations | 
|  | ------------ | 
|  |  | 
|  | .. Omitting mention of arcane "field" prefix to discourage its use. | 
|  |  | 
|  | The declaration syntax is pretty much what you would expect as a C++ | 
|  | programmer. | 
|  |  | 
|  | .. productionlist:: | 
|  | Declaration: `Type` `TokIdentifier` ["=" `Value`] | 
|  |  | 
|  | If the optional :token:`Value` is present, it is assigned to the identifier | 
|  | as its initial value. | 
|  |  | 
|  | Types | 
|  | ----- | 
|  |  | 
|  | .. productionlist:: | 
|  | Type: "string" | "code" | "bit" | "int" | "dag" | 
|  | :| "bits" "<" `TokInteger` ">" | 
|  | :| "list" "<" `Type` ">" | 
|  | :| `ClassID` | 
|  | ClassID: `TokIdentifier` | 
|  |  | 
|  | Both ``string`` and ``code`` correspond to the string type; the difference | 
|  | is purely to indicate programmer intention. | 
|  |  | 
|  | The :token:`ClassID` must identify a class that has been previously | 
|  | declared or defined. | 
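|  |  | 
|  | As a rough illustration, the sketch below declares one field of most of the | 
|  | types listed above. ``Flag``, ``FlagA``, and ``TypeDemo`` are hypothetical | 
|  | names introduced only for this example:: | 
|  |  | 
|  |   class Flag { int Id = 0; } | 
|  |   def FlagA : Flag; | 
|  |  | 
|  |   def TypeDemo { | 
|  |     bit B = 1; | 
|  |     int I = 10; | 
|  |     string S = "text"; | 
|  |     code C = [{ x = y; }]; | 
|  |     bits<8> Byte = 0x0F; | 
|  |     list<int> L = [1, 2, 3]; | 
|  |     Flag F = FlagA;          // ClassID used as a type | 
|  |   } | 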
|  |  | 
|  | Values | 
|  | ------ | 
|  |  | 
|  | .. productionlist:: | 
|  | Value: `SimpleValue` `ValueSuffix`* | 
|  | ValueSuffix: "{" `RangeList` "}" | 
|  | :| "[" `RangeList` "]" | 
|  | :| "." `TokIdentifier` | 
|  | RangeList: `RangePiece` ("," `RangePiece`)* | 
|  | RangePiece: `TokInteger` | 
|  | :| `TokInteger` "-" `TokInteger` | 
|  | :| `TokInteger` `TokInteger` | 
|  |  | 
|  | The peculiar last form of :token:`RangePiece` is due to the fact that the | 
|  | "``-``" is included in the :token:`TokInteger`, hence ``1-5`` gets lexed as | 
|  | two consecutive :token:`TokInteger`'s, with values ``1`` and ``-5``, | 
|  | instead of "1", "-", and "5". | 
|  | The :token:`RangeList` can be thought of as specifying a "list slice" in some | 
|  | contexts. | 
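|  |  | 
|  | A minimal sketch of the three suffix forms, assuming hypothetical records | 
|  | ``Src`` and ``Use``:: | 
|  |  | 
|  |   def Src { | 
|  |     bits<8> Val = 0b10100101; | 
|  |     list<int> Nums = [10, 20, 30, 40]; | 
|  |   } | 
|  |  | 
|  |   def Use { | 
|  |     bits<4> Low = Src.Val{3-0};      // "." field access, then a bit range | 
|  |     list<int> Mid = Src.Nums[1-2];   // "." field access, then a list slice | 
|  |   } | 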
|  |  | 
|  |  | 
|  | :token:`SimpleValue` has a number of forms: | 
|  |  | 
|  |  | 
|  | .. productionlist:: | 
|  | SimpleValue: `TokIdentifier` | 
|  |  | 
|  | The value will be the variable referenced by the identifier. It can be one | 
|  | of: | 
|  |  | 
|  | .. The code for this is exceptionally abstruse. These examples are a | 
|  | best-effort attempt. | 
|  |  | 
|  | * name of a ``def``, such as the use of ``Bar`` in:: | 
|  |  | 
|  | def Bar : SomeClass { | 
|  | int X = 5; | 
|  | } | 
|  |  | 
|  | def Foo { | 
|  | SomeClass Baz = Bar; | 
|  | } | 
|  |  | 
|  | * value local to a ``def``, such as the use of ``Bar`` in:: | 
|  |  | 
|  | def Foo { | 
|  | int Bar = 5; | 
|  | int Baz = Bar; | 
|  | } | 
|  |  | 
|  | Values defined in superclasses can be accessed the same way. | 
|  |  | 
|  | * a template arg of a ``class``, such as the use of ``Bar`` in:: | 
|  |  | 
|  | class Foo<int Bar> { | 
|  | int Baz = Bar; | 
|  | } | 
|  |  | 
|  | * value local to a ``class``, such as the use of ``Bar`` in:: | 
|  |  | 
|  | class Foo { | 
|  | int Bar = 5; | 
|  | int Baz = Bar; | 
|  | } | 
|  |  | 
|  | * a template arg to a ``multiclass``, such as the use of ``Bar`` in:: | 
|  |  | 
|  | multiclass Foo<int Bar> { | 
|  | def : SomeClass<Bar>; | 
|  | } | 
|  |  | 
|  | * the iteration variable of a ``foreach``, such as the use of ``i`` in:: | 
|  |  | 
|  | foreach i = 0-5 in | 
|  | def Foo#i; | 
|  |  | 
|  | * a variable defined by ``defset`` | 
|  |  | 
|  | * the implicit template argument ``NAME`` in a ``class`` or ``multiclass`` | 
|  |  | 
|  | .. productionlist:: | 
|  | SimpleValue: `TokInteger` | 
|  |  | 
|  | This represents the numeric value of the integer. | 
|  |  | 
|  | .. productionlist:: | 
|  | SimpleValue: `TokString`+ | 
|  |  | 
|  | Multiple adjacent string literals are concatenated, as in C/C++. The value | 
|  | is the resulting concatenated string. | 
|  |  | 
|  | .. productionlist:: | 
|  | SimpleValue: `TokCodeFragment` | 
|  |  | 
|  | The value is the string value of the code fragment. | 
|  |  | 
|  | .. productionlist:: | 
|  | SimpleValue: "?" | 
|  |  | 
|  | ``?`` represents an "unset" initializer. | 
|  |  | 
|  | .. productionlist:: | 
|  | SimpleValue: "{" `ValueList` "}" | 
|  | ValueList: [`ValueListNE`] | 
|  | ValueListNE: `Value` ("," `Value`)* | 
|  |  | 
|  | This represents a sequence of bits, as would be used to initialize a | 
|  | ``bits<n>`` field (where ``n`` is the number of bits). | 
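|  |  | 
|  | For instance, a four-bit field could be initialized bit by bit as in this | 
|  | sketch (the record and field names are hypothetical):: | 
|  |  | 
|  |   def BitsDemo { | 
|  |     bits<4> Nibble = { 1, 0, 1, 0 };  // one value per bit | 
|  |   } | 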
|  |  | 
|  | .. productionlist:: | 
|  | SimpleValue: `ClassID` "<" `ValueListNE` ">" | 
|  |  | 
|  | This generates a new anonymous record definition (as would be created by an | 
|  | unnamed ``def`` inheriting from the given class with the given template | 
|  | arguments) and the value is the value of that record definition. | 
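|  |  | 
|  | The following sketch shows an anonymous record used directly as a value and | 
|  | as the base of a field access. ``Pair`` and ``Holder`` are hypothetical | 
|  | names:: | 
|  |  | 
|  |   class Pair<int A, int B> { | 
|  |     int Sum = !add(A, B); | 
|  |   } | 
|  |  | 
|  |   def Holder { | 
|  |     Pair P = Pair<1, 2>;      // value is the anonymous record itself | 
|  |     int S = Pair<1, 2>.Sum;   // a field of an anonymous record | 
|  |   } | 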
|  |  | 
|  | .. productionlist:: | 
|  | SimpleValue: "[" `ValueList` "]" ["<" `Type` ">"] | 
|  |  | 
|  | A list initializer. The optional :token:`Type` can be used to indicate a | 
|  | specific element type; otherwise, the element type will be deduced from the | 
|  | given values. | 
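|  |  | 
|  | A short sketch of both forms, with hypothetical names. The second field | 
|  | assumes that the explicit type is needed because an empty list has no | 
|  | elements to deduce from:: | 
|  |  | 
|  |   def ListDemo { | 
|  |     list<int> Ints = [1, 2, 3];       // element type deduced from the values | 
|  |     list<string> None = []<string>;   // element type given explicitly | 
|  |   } | 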
|  |  | 
|  | .. The initial `DagArg` of the dag must start with an identifier or | 
|  | !cast, but this is more of an implementation detail and so for now just | 
|  | leave it out. | 
|  |  | 
|  | .. productionlist:: | 
|  | SimpleValue: "(" `DagArg` [`DagArgList`] ")" | 
|  | DagArgList: `DagArg` ("," `DagArg`)* | 
|  | DagArg: `Value` [":" `TokVarName`] | `TokVarName` | 
|  |  | 
|  | The initial :token:`DagArg` is called the "operator" of the dag. | 
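|  |  | 
|  | As an illustration, the sketch below builds a dag whose operator is a record | 
|  | and whose arguments carry names. ``MyOp``, ``RegA``, and ``DagDemo`` are | 
|  | hypothetical:: | 
|  |  | 
|  |   def MyOp;   // used as the dag operator | 
|  |   def RegA; | 
|  |  | 
|  |   def DagDemo { | 
|  |     dag Pattern = (MyOp RegA:$lhs, 3:$rhs); | 
|  |   } | 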
|  |  | 
|  | .. productionlist:: | 
|  | SimpleValue: `BangOperator` ["<" `Type` ">"] "(" `ValueListNE` ")" | 
|  |  | 
|  | Bodies | 
|  | ------ | 
|  |  | 
|  | .. productionlist:: | 
|  | ObjectBody: `BaseClassList` `Body` | 
|  | BaseClassList: [":" `BaseClassListNE`] | 
|  | BaseClassListNE: `SubClassRef` ("," `SubClassRef`)* | 
|  | SubClassRef: (`ClassID` | `MultiClassID`) ["<" `ValueList` ">"] | 
|  | DefmID: `TokIdentifier` | 
|  |  | 
|  | The version with the :token:`MultiClassID` is only valid in the | 
|  | :token:`BaseClassList` of a ``defm``. | 
|  | The :token:`MultiClassID` should be the name of a ``multiclass``. | 
|  |  | 
|  | .. put this somewhere else | 
|  |  | 
|  | It is after parsing the base class list that the "let stack" is applied. | 
|  |  | 
|  | .. productionlist:: | 
|  | Body: ";" | "{" BodyList "}" | 
|  | BodyList: BodyItem* | 
|  | BodyItem: `Declaration` ";" | 
|  | :| "let" `TokIdentifier` [ "{" `RangeList` "}" ] "=" `Value` ";" | 
|  |  | 
|  | The ``let`` form allows overriding the value of an inherited field. | 
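|  |  | 
|  | A minimal sketch of a body that mixes both kinds of :token:`BodyItem`, using | 
|  | hypothetical names:: | 
|  |  | 
|  |   class Base { | 
|  |     int Width = 32; | 
|  |     bits<8> Flags = 0; | 
|  |   } | 
|  |  | 
|  |   def Narrow : Base { | 
|  |     let Width = 16;     // override an inherited field | 
|  |     let Flags{0} = 1;   // override a single bit of an inherited field | 
|  |     int Extra = 1;      // an ordinary declaration | 
|  |   } | 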
|  |  | 
|  | ``def`` | 
|  | ------- | 
|  |  | 
|  | .. productionlist:: | 
|  | Def: "def" [`Value`] `ObjectBody` | 
|  |  | 
|  | Defines a record whose name is given by the optional :token:`Value`. The value | 
|  | is parsed in a special mode where global identifiers (records and variables | 
|  | defined by ``defset``) are not recognized, and all unrecognized identifiers | 
|  | are interpreted as strings. | 
|  |  | 
|  | If no name is given, the record is anonymous. The final name of anonymous | 
|  | records is unspecified, but globally unique. | 
|  |  | 
|  | Special handling occurs if this ``def`` appears inside a ``multiclass`` or | 
|  | a ``foreach``. | 
|  |  | 
|  | When a non-anonymous record is defined in a multiclass and the given name | 
|  | does not contain a reference to the implicit template argument ``NAME``, such | 
|  | a reference will automatically be prepended. That is, the following are | 
|  | equivalent inside a multiclass:: | 
|  |  | 
|  | def Foo; | 
|  | def NAME#Foo; | 
|  |  | 
|  | ``defm`` | 
|  | -------- | 
|  |  | 
|  | .. productionlist:: | 
|  | Defm: "defm" [`Value`] ":" `BaseClassListNE` ";" | 
|  |  | 
|  | The :token:`BaseClassListNE` is a list of at least one ``multiclass`` and any | 
|  | number of ``class``\es. The ``multiclass``\es must occur before any | 
|  | ``class``\es. | 
|  |  | 
|  | Instantiates all records defined in all given ``multiclass``\es and adds the | 
|  | given ``class``\es as superclasses. | 
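|  |  | 
|  | The ordering rule above can be sketched as follows. All names are | 
|  | hypothetical; the point is only that the multiclass comes first and the plain | 
|  | class follows it:: | 
|  |  | 
|  |   class Tag { bit Tagged = 1; } | 
|  |   class Op<int C> { int Code = C; } | 
|  |  | 
|  |   multiclass Two<int Base> { | 
|  |     def A : Op<Base>; | 
|  |     def B : Op<!add(Base, 1)>; | 
|  |   } | 
|  |  | 
|  |   // Should define XA and XB, each with Tag added as a superclass. | 
|  |   defm X : Two<10>, Tag; | 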
|  |  | 
|  | The name is parsed in the same special mode used by ``def``. If the name is | 
|  | missing, a globally unique string is used instead (but instantiated records | 
|  | are not considered to be anonymous, unless they were originally defined by an | 
|  | anonymous ``def``). That is, the following have different semantics:: | 
|  |  | 
|  | defm : SomeMultiClass<...>;    // some globally unique name | 
|  | defm "" : SomeMultiClass<...>; // empty name string | 
|  |  | 
|  | When it occurs inside a multiclass, the second variant is equivalent to | 
|  | ``defm NAME : ...``. More generally, when ``defm`` occurs in a multiclass and | 
|  | its name does not contain a reference to the implicit template argument | 
|  | ``NAME``, such a reference will automatically be prepended. That is, the | 
|  | following are equivalent inside a multiclass:: | 
|  |  | 
|  | defm Foo : SomeMultiClass<...>; | 
|  | defm NAME#Foo : SomeMultiClass<...>; | 
|  |  | 
|  | ``defset`` | 
|  | ---------- | 
|  |  | 
|  | .. productionlist:: | 
|  | Defset: "defset" `Type` `TokIdentifier` "=" "{" `Object`* "}" | 
|  |  | 
|  | All records defined inside the braces via ``def`` and ``defm`` are collected | 
|  | in a globally accessible list of the given name (in addition to being added | 
|  | to the global collection of records as usual). Anonymous records created inside | 
|  | initializer expressions using the ``Class<args...>`` syntax are never collected | 
|  | in a defset. | 
|  |  | 
|  | The given type must be ``list<A>``, where ``A`` is some class. It is an error | 
|  | to define a record (via ``def`` or ``defm``) inside the braces which doesn't | 
|  | derive from ``A``. | 
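|  |  | 
|  | A small sketch of a ``defset``, with hypothetical class and record names:: | 
|  |  | 
|  |   class Item<int V> { int Value = V; } | 
|  |  | 
|  |   defset list<Item> AllItems = { | 
|  |     def I1 : Item<1>; | 
|  |     def I2 : Item<2>; | 
|  |   } | 
|  |  | 
|  |   // AllItems can then be used like any other list value. | 
|  |   def Summary { | 
|  |     list<Item> Items = AllItems; | 
|  |   } | 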
|  |  | 
|  | ``foreach`` | 
|  | ----------- | 
|  |  | 
|  | .. productionlist:: | 
|  | Foreach: "foreach" `ForeachDeclaration` "in" "{" `Object`* "}" | 
|  | :| "foreach" `ForeachDeclaration` "in" `Object` | 
|  | ForeachDeclaration: ID "=" ( "{" `RangeList` "}" | `RangePiece` | `Value` ) | 
|  |  | 
|  | The value assigned to the variable in the declaration is iterated over and | 
|  | the object or object list is reevaluated with the variable set at each | 
|  | iterated value. | 
|  |  | 
|  | Note that the productions involving :token:`RangeList` and | 
|  | :token:`RangePiece` take precedence over the more generic value parsing; | 
|  | the choice is made based on the first token. | 
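|  |  | 
|  | Both forms of ``foreach`` might be used as in the sketch below. The class and | 
|  | record names are hypothetical:: | 
|  |  | 
|  |   class Reg<int N> { int Index = N; } | 
|  |  | 
|  |   // Single-object form, iterating over a RangePiece. | 
|  |   foreach i = 0-3 in | 
|  |     def R#i : Reg<i>; | 
|  |  | 
|  |   // Braced form, iterating over a list value. | 
|  |   foreach i = [1, 4] in { | 
|  |     def Q#i : Reg<i>; | 
|  |   } | 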
|  |  | 
|  | Top-Level ``let`` | 
|  | ----------------- | 
|  |  | 
|  | .. productionlist:: | 
|  | Let:  "let" `LetList` "in" "{" `Object`* "}" | 
|  | :| "let" `LetList` "in" `Object` | 
|  | LetList: `LetItem` ("," `LetItem`)* | 
|  | LetItem: `TokIdentifier` [`RangeList`] "=" `Value` | 
|  |  | 
|  | This is effectively equivalent to ``let`` inside the body of a record | 
|  | except that it applies to multiple records at a time. The bindings are | 
|  | applied at the end of parsing the base classes of a record. | 
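|  |  | 
|  | For example, the following sketch applies the same overrides to several | 
|  | records at once. ``Inst`` and the record names are hypothetical:: | 
|  |  | 
|  |   class Inst { | 
|  |     bit isBranch = 0; | 
|  |     int Size = 4; | 
|  |   } | 
|  |  | 
|  |   // Braced form: the bindings apply to every record inside the braces. | 
|  |   let isBranch = 1, Size = 2 in { | 
|  |     def B  : Inst; | 
|  |     def BL : Inst; | 
|  |   } | 
|  |  | 
|  |   // Single-object form. | 
|  |   let Size = 8 in | 
|  |     def Wide : Inst; | 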
|  |  | 
|  | ``multiclass`` | 
|  | -------------- | 
|  |  | 
|  | .. productionlist:: | 
|  | MultiClass: "multiclass" `TokIdentifier` [`TemplateArgList`] | 
|  | : [":" `BaseMultiClassList`] "{" `MultiClassObject`+ "}" | 
|  | BaseMultiClassList: `MultiClassID` ("," `MultiClassID`)* | 
|  | MultiClassID: `TokIdentifier` | 
|  | MultiClassObject: `Def` | `Defm` | `Let` | `Foreach` |
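|  |  | 
|  | A minimal sketch of a ``multiclass`` and its instantiation, using | 
|  | hypothetical names:: | 
|  |  | 
|  |   class Inst<string Asm> { string AsmString = Asm; } | 
|  |  | 
|  |   multiclass RR_RI<string Mnemonic> { | 
|  |     def rr : Inst<!strconcat(Mnemonic, " r, r")>; | 
|  |     def ri : Inst<!strconcat(Mnemonic, " r, i")>; | 
|  |   } | 
|  |  | 
|  |   // NAME is prepended to each inner def, giving ADDrr and ADDri. | 
|  |   defm ADD : RR_RI<"add">; | 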