===========================
TableGen Language Reference
===========================

.. sectionauthor:: Sean Silva <silvas@purdue.edu>

.. contents::
   :local:

.. warning::
   This document is extremely rough. If you find something lacking, please
   fix it, file a documentation bug, or ask about it on llvmdev.

Introduction
============

This document is meant to be a normative spec about the TableGen language
in and of itself (i.e. how to understand a given construct in terms of how
it affects the final set of records represented by the TableGen file). If
you are unsure if this document is really what you are looking for, please
read :doc:`/TableGenFundamentals` first.

Notation
========

The lexical and syntax notation used here is intended to imitate
`Python's`_. In particular, for lexical definitions, the productions
operate at the character level and there is no implied whitespace between
elements. The syntax definitions operate at the token level, so there is
implied whitespace between tokens.

.. _`Python's`: http://docs.python.org/py3k/reference/introduction.html#notation

Lexical Analysis
================

TableGen supports BCPL (``// ...``) and nestable C-style (``/* ... */``)
comments.

The following is a listing of the basic punctuation tokens::

   - + [ ] { } ( ) < > : ; . = ? #

Numeric literals take one of the following forms:

.. TableGen actually will lex some pretty strange sequences and interpret
   them as numbers. What is shown here is an attempt to approximate what it
   "should" accept.

.. productionlist::
   TokInteger: `DecimalInteger` | `HexInteger` | `BinInteger`
   DecimalInteger: ["+" | "-"] ("0"..."9")+
   HexInteger: "0x" ("0"..."9" | "a"..."f" | "A"..."F")+
   BinInteger: "0b" ("0" | "1")+

One aspect to note is that the :token:`DecimalInteger` token *includes* the
``+`` or ``-``, as opposed to having ``+`` and ``-`` be unary operators as
most languages do.
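
As a small illustration of these forms, the following sketch initializes
``int`` fields with each kind of literal. The record and field names are
made up for this example::

   def LiteralDemo {
     int Dec = 42;        // DecimalInteger
     int Neg = -42;       // DecimalInteger; the "-" is part of the token
     int Hex = 0x2A;      // HexInteger
     int Bin = 0b101010;  // BinInteger
   }
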

TableGen has identifier-like tokens:

.. productionlist::
   ualpha: "a"..."z" | "A"..."Z" | "_"
   TokIdentifier: ("0"..."9")* `ualpha` (`ualpha` | "0"..."9")*
   TokVarName: "$" `ualpha` (`ualpha` | "0"..."9")*

Note that unlike most languages, TableGen allows :token:`TokIdentifier` to
begin with a number. In case of ambiguity, a token will be interpreted as a
numeric literal rather than an identifier.
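
For instance, assuming the lexer behaves as described above, the record
name in this sketch is a single :token:`TokIdentifier` even though it
starts with a digit, while the bare ``3`` is a :token:`TokInteger`. The
names are hypothetical::

   def 3DNowDemo {
     int Three = 3;
   }
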

TableGen also has two string-like literals:

.. productionlist::
   TokString: '"' <non-'"' characters and C-like escapes> '"'
   TokCodeFragment: "[{" <shortest text not containing "}]"> "}]"
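
A brief sketch of both literal forms follows. The record name, field names,
and the contents of the literals are invented for illustration::

   def StringDemo {
     // TokString with a C-like escape (a tab character).
     string Asm = "add\t$dst, $src";

     // TokCodeFragment: everything between "[{" and the first "}]".
     code Check = [{ return Imm >= 0 && Imm < 16; }];
   }
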

TableGen also has the following keywords::

   bit   bits      class   code         dag
   def   foreach   defm    field        in
   int   let       list    multiclass   string

TableGen also has "bang operators" which have a
wide variety of meanings::

   !eq     !if      !head    !tail      !con
   !shl    !sra     !srl
   !cast   !empty   !subst   !foreach   !strconcat

Syntax
======

TableGen has an ``include`` mechanism. It does not play a role in the
syntax per se, since it is lexically replaced with the contents of the
included file.

.. productionlist::
   IncludeDirective: "include" `TokString`

TableGen's top-level production consists of "objects".

.. productionlist::
   TableGenFile: `Object`*
   Object: `Class` | `Def` | `Defm` | `Let` | `MultiClass` | `Foreach`

``class``\es
------------

.. productionlist::
   Class: "class" `TokIdentifier` [`TemplateArgList`] `ObjectBody`

A ``class`` declaration creates a record which other records can inherit
from. A class can be parametrized by a list of "template arguments", whose
values can be used in the class body.

A given class can only be defined once. A ``class`` declaration is
considered to define the class if any of the following is true:

.. break ObjectBody into its constituents so that they are present here?

#. The :token:`TemplateArgList` is present.
#. The :token:`Body` in the :token:`ObjectBody` is present and is not empty.
#. The :token:`BaseClassList` in the :token:`ObjectBody` is present.

You can declare an empty class by giving an empty :token:`TemplateArgList`
and an empty :token:`ObjectBody`. This can serve as a restricted form of
forward declaration: note that records deriving from the forward-declared
class will inherit no fields from it since the record expansion is done
when the record is parsed.

.. productionlist::
   TemplateArgList: "<" `Declaration` ("," `Declaration`)* ">"
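
The following sketch shows a class definition, a bare declaration, and a
record that inherits from the definition. All names are invented for
illustration::

   // Defines Insn: it has a template argument list and a non-empty body.
   class Insn<int opcode> {
     int Opcode = opcode;
   }

   // Only declares Marker: no template arguments, base classes, or body.
   class Marker;

   // ADD inherits the Opcode field, with opcode bound to 1.
   def ADD : Insn<1>;
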

Declarations
------------

.. Omitting mention of arcane "field" prefix to discourage its use.

The declaration syntax is pretty much what you would expect as a C++
programmer.

.. productionlist::
   Declaration: `Type` `TokIdentifier` ["=" `Value`]

It assigns the value to the identifier.

Types
-----

.. productionlist::
   Type: "string" | "code" | "bit" | "int" | "dag"
       :| "bits" "<" `TokInteger` ">"
       :| "list" "<" `Type` ">"
       :| `ClassID`
   ClassID: `TokIdentifier`

Both ``string`` and ``code`` correspond to the string type; the difference
is purely to indicate programmer intention.

The :token:`ClassID` must identify a class that has been previously
declared or defined.
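
As a sketch, the record below declares one field of most of the types
listed above. ``Reg``, ``R0``, ``TypeDemo``, and the field names are
hypothetical::

   class Reg;
   def R0 : Reg;

   def TypeDemo {
     bit Flag = 1;
     int Count = 3;
     string Name = "demo";
     bits<4> Nibble = 10;            // the value must fit in 4 bits
     list<int> Sizes = [8, 16, 32];
     Reg Base = R0;                  // ClassID type; Reg is declared above
   }
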

Values
------

.. productionlist::
   Value: `SimpleValue` `ValueSuffix`*
   ValueSuffix: "{" `RangeList` "}"
              :| "[" `RangeList` "]"
              :| "." `TokIdentifier`
   RangeList: `RangePiece` ("," `RangePiece`)*
   RangePiece: `TokInteger`
             :| `TokInteger` "-" `TokInteger`
             :| `TokInteger` `TokInteger`

The peculiar last form of :token:`RangePiece` is due to the fact that the
"``-``" is included in the :token:`TokInteger`, hence ``1-5`` gets lexed as
two consecutive :token:`TokInteger`'s, with values ``1`` and ``-5``,
instead of ``1``, ``-``, and ``5``.
The :token:`RangeList` can be thought of as specifying a "list slice" in
some contexts.
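
The suffix forms can be sketched as follows, with made-up record and field
names. The ``{...}`` suffix selects bits and the ``[...]`` suffix selects
list elements::

   def SliceDemo {
     bits<8> Word = 0b10110100;
     bits<4> High = Word{7-4};      // bits 7 down to 4 of Word
     bits<2> Ends = Word{7, 0};     // two RangePieces: bit 7 and bit 0

     list<int> Items = [10, 20, 30, 40];
     list<int> Middle = Items[1-2]; // elements 1 and 2 of Items
   }

Here ``7-4`` and ``1-2`` are each written without spaces, so they exercise
the last :token:`RangePiece` form described above.
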

:token:`SimpleValue` has a number of forms:

.. productionlist::
   SimpleValue: `TokIdentifier`

The value will be the variable referenced by the identifier. It can be one
of:

.. The code for this is exceptionally abstruse. These examples are a
   best-effort attempt.

* name of a ``def``, such as the use of ``Bar`` in::

    def Bar : SomeClass {
      int X = 5;
    }

    def Foo {
      SomeClass Baz = Bar;
    }

* value local to a ``def``, such as the use of ``Bar`` in::

    def Foo {
      int Bar = 5;
      int Baz = Bar;
    }

* a template arg of a ``class``, such as the use of ``Bar`` in::

    class Foo<int Bar> {
      int Baz = Bar;
    }

* value local to a ``multiclass``, such as the use of ``Bar`` in::

    multiclass Foo {
      int Bar = 5;
      int Baz = Bar;
    }

* a template arg to a ``multiclass``, such as the use of ``Bar`` in::

    multiclass Foo<int Bar> {
      int Baz = Bar;
    }

.. productionlist::
   SimpleValue: `TokInteger`

This represents the numeric value of the integer.

.. productionlist::
   SimpleValue: `TokString`+

Multiple adjacent string literals are concatenated like in C/C++. The value
is the concatenation of the strings.

.. productionlist::
   SimpleValue: `TokCodeFragment`

The value is the string value of the code fragment.

.. productionlist::
   SimpleValue: "?"

``?`` represents an "unset" initializer.

.. productionlist::
   SimpleValue: "{" `ValueList` "}"
   ValueList: [`ValueListNE`]
   ValueListNE: `Value` ("," `Value`)*

This represents a sequence of bits, as would be used to initialize a
``bits<n>`` field (where ``n`` is the number of bits).
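
A minimal sketch of a bit-sequence initializer, using hypothetical names.
Each element supplies one bit, and an element may itself refer to a ``bit``
value::

   def BitsDemo {
     bit Sign = 1;
     bits<4> Nibble = { Sign, 0, 1, 0 };
   }
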

.. productionlist::
   SimpleValue: `ClassID` "<" `ValueListNE` ">"

This generates a new anonymous record definition (as would be created by an
unnamed ``def`` inheriting from the given class with the given template
arguments) and the value is the value of that record definition.
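
For example, in the following sketch the field ``Src`` is initialized with
an anonymous record derived from ``Operand``. The class, record, and field
names are invented for illustration::

   class Operand<string name> {
     string Name = name;
   }

   def MoveInsn {
     // The value is an unnamed record inheriting from Operand<"rs">.
     Operand Src = Operand<"rs">;
   }
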

.. productionlist::
   SimpleValue: "[" `ValueList` "]" ["<" `Type` ">"]

A list initializer. The optional :token:`Type` can be used to indicate a
specific element type, otherwise the element type will be deduced from the
given values.
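
The following sketch, with made-up names, shows a list whose element type
is deduced, one with an explicit element type, and an empty list whose
element type comes from the field being initialized::

   def ListDemo {
     list<int> Primes = [2, 3, 5, 7];
     list<string> Names = ["r0", "r1"]<string>;
     list<int> Nothing = [];
   }
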

.. The initial `DagArg` of the dag must start with an identifier or
   !cast, but this is more of an implementation detail and so for now just
   leave it out.

.. productionlist::
   SimpleValue: "(" `DagArg` [`DagArgList`] ")"
   DagArgList: `DagArg` ("," `DagArg`)*
   DagArg: `Value` [":" `TokVarName`]

The initial :token:`DagArg` is called the "operator" of the dag.
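
As a sketch, the dag below uses the record ``ins`` as its operator and has
two arguments, each tagged with a :token:`TokVarName`. All of the names are
hypothetical::

   class Reg;
   def R0 : Reg;
   def R1 : Reg;
   def ins;

   def DagDemo {
     // "ins" is the operator; note there is no comma after it.
     dag Operands = (ins R0:$src, R1:$dst);
   }
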

.. productionlist::
   SimpleValue: `BangOperator` ["<" `Type` ">"] "(" `ValueListNE` ")"
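
A short sketch of a few of the bang operators listed in the lexical
section, applied to a template argument. The class, record, and field names
are invented for this example::

   class Sized<int width> {
     // !eq compares its operands; !if selects between two values.
     string Suffix = !if(!eq(width, 64), "q", "l");

     // !srl is a logical shift right, here converting bits to bytes.
     int Bytes = !srl(width, 3);

     // !strconcat concatenates strings.
     string Name = !strconcat("mov", Suffix);
   }

   def MOV32 : Sized<32>;
   def MOV64 : Sized<64>;
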

Bodies
------

.. productionlist::
   ObjectBody: `BaseClassList` `Body`
   BaseClassList: [":" `BaseClassListNE`]
   BaseClassListNE: `SubClassRef` ("," `SubClassRef`)*
   SubClassRef: (`ClassID` | `MultiClassID`) ["<" `ValueList` ">"]
   DefmID: `TokIdentifier`

The version with the :token:`MultiClassID` is only valid in the
:token:`BaseClassListNE` of a ``defm``.
The :token:`MultiClassID` should be the name of a ``multiclass``.

.. put this somewhere else

It is after parsing the base class list that the "let stack" is applied.

.. productionlist::
   Body: ";" | "{" `BodyList` "}"
   BodyList: `BodyItem`*
   BodyItem: `Declaration` ";"
           :| "let" `TokIdentifier` ["{" `RangeList` "}"] "=" `Value` ";"

The ``let`` form allows overriding the value of an inherited field.

``def``
-------

.. TODO::
   There can be pastes in the names here, like ``#NAME#``. Look into that
   and document it (it boils down to ParseIDValue with IDParseMode ==
   ParseNameMode). ParseObjectName calls into the general ParseValue, with
   the only difference from "arbitrary expression parsing" being IDParseMode
   == Mode.

.. productionlist::
   Def: "def" `TokIdentifier` `ObjectBody`

Defines a record whose name is given by the :token:`TokIdentifier`. The
fields of the record are inherited from the base classes and defined in the
body.

Special handling occurs if this ``def`` appears inside a ``multiclass`` or
a ``foreach``.

``defm``
--------

.. productionlist::
   Defm: "defm" `TokIdentifier` ":" `BaseClassListNE` ";"

Note that in the :token:`BaseClassListNE`, all of the ``multiclass``'s must
precede any ``class``'s that appear.
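
The ordering rule can be sketched as follows, with hypothetical names. The
``multiclass`` comes first in the base class list of the ``defm``, and the
ordinary ``class`` follows it::

   class Tagged {
     string Tag = "alu";
   }

   multiclass RegImm<int opcode> {
     def rr { int Opcode = opcode; }
     def ri { int Opcode = opcode; }
   }

   // Defines ADDrr and ADDri; each of them also inherits from Tagged.
   defm ADD : RegImm<1>, Tagged;
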

``foreach``
-----------

.. productionlist::
   Foreach: "foreach" `ForeachDeclaration` "in" "{" `Object`* "}"
          :| "foreach" `ForeachDeclaration` "in" `Object`
   ForeachDeclaration: `TokIdentifier` "=" ("[" `ValueList` "]" | "{" `RangeList` "}" | `RangePiece`)

The value assigned to the variable in the declaration is iterated over and
the object or object list is reevaluated with the variable set at each
iterated value.
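
A minimal sketch of a ``foreach`` over a list, using made-up names. Note
that the iteration variable is introduced as ``i = ...`` without a type,
and that the ``#`` in the record name pastes the current value of ``i``
onto ``R``::

   class Register<int num> {
     int Num = num;
   }

   // Defines R0, R1, R2, and R3.
   foreach i = [0, 1, 2, 3] in
     def R#i : Register<i>;
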

Top-Level ``let``
-----------------

.. productionlist::
   Let: "let" `LetList` "in" "{" `Object`* "}"
      :| "let" `LetList` "in" `Object`
   LetList: `LetItem` ("," `LetItem`)*
   LetItem: `TokIdentifier` [`RangeList`] "=" `Value`

This is effectively equivalent to ``let`` inside the body of a record
except that it applies to multiple records at a time. The bindings are
applied at the end of parsing the base classes of a record.
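
The equivalence can be sketched as follows, with hypothetical names. The
braced form applies the binding to every record inside the braces, the
unbraced form applies it to a single object, and the last record gets the
same override from a ``let`` in its own body::

   class Insn {
     bit IsBranch = 0;
     int Size = 4;
   }

   let IsBranch = 1 in {
     def B  : Insn;
     def BL : Insn;
   }

   let Size = 2 in
   def NOP16 : Insn;

   def BX : Insn {
     let IsBranch = 1;
   }
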

``multiclass``
--------------

.. productionlist::
   MultiClass: "multiclass" `TokIdentifier` [`TemplateArgList`]
             : [":" `BaseMultiClassList`] "{" `MultiClassObject`+ "}"
   BaseMultiClassList: `MultiClassID` ("," `MultiClassID`)*
   MultiClassID: `TokIdentifier`
   MultiClassObject: `Def` | `Defm` | `Let` | `Foreach`
Sean Silva9302dcc2013-01-09 02:11:55 +0000376 MultiClassObject: `Def` | `Defm` | `Let` | `Foreach`