===========================
TableGen Language Reference
===========================

.. sectionauthor:: Sean Silva <silvas@purdue.edu>

.. contents::
   :local:

.. warning::
   This document is extremely rough. If you find something lacking, please
   fix it, file a documentation bug, or ask about it on llvmdev.

Introduction
============

This document is meant to be a normative spec about the TableGen language
in and of itself (i.e. how to understand a given construct in terms of how
it affects the final set of records represented by the TableGen file). If
you are unsure if this document is really what you are looking for, please
read :doc:`/TableGenFundamentals` first.

Notation
========

The lexical and syntax notation used here is intended to imitate
`Python's`_. In particular, for lexical definitions, the productions
operate at the character level and there is no implied whitespace between
elements. The syntax definitions operate at the token level, so there is
implied whitespace between tokens.

.. _`Python's`: http://docs.python.org/py3k/reference/introduction.html#notation

Lexical Analysis
================

TableGen supports BCPL (``// ...``) and nestable C-style (``/* ... */``)
comments.

The following is a listing of the basic punctuation tokens::

   - + [ ] { } ( ) < > : ; . = ? #

Numeric literals take one of the following forms:

.. TableGen actually will lex some pretty strange sequences and interpret
   them as numbers. What is shown here is an attempt to approximate what it
   "should" accept.

.. productionlist::
   TokInteger: `DecimalInteger` | `HexInteger` | `BinInteger`
   DecimalInteger: ["+" | "-"] ("0"..."9")+
   HexInteger: "0x" ("0"..."9" | "a"..."f" | "A"..."F")+
   BinInteger: "0b" ("0" | "1")+

One aspect to note is that the :token:`DecimalInteger` token *includes* the
``+`` or ``-``, as opposed to having ``+`` and ``-`` be unary operators as
most languages do.
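
As a rough illustration, the three forms might appear as follows. The record
and field names are hypothetical::

   def IntegerForms {
     int Dec = 42;        // DecimalInteger
     int Neg = -7;        // one token: the "-" is part of the literal
     int Hex = 0x2A;      // HexInteger
     int Bin = 0b101010;  // BinInteger
   }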

TableGen has identifier-like tokens:

.. productionlist::
   ualpha: "a"..."z" | "A"..."Z" | "_"
   TokIdentifier: ("0"..."9")* `ualpha` (`ualpha` | "0"..."9")*
   TokVarName: "$" `ualpha` (`ualpha` | "0"..."9")*

Note that unlike most languages, TableGen allows :token:`TokIdentifier` to
begin with a number. In case of ambiguity, a token will be interpreted as a
numeric literal rather than an identifier.
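
For instance, going by the production above, a record name may start with
digits as long as a letter or underscore follows. A minimal sketch with a
made-up name::

   // Lexed as a single TokIdentifier rather than "3" followed by a name.
   def 3DNowFeature;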

TableGen also has two string-like literals:

.. productionlist::
   TokString: '"' <non-'"' characters and C-like escapes> '"'
   TokCodeFragment: "[{" <shortest text not containing "}]"> "}]"

.. note::
   The current implementation accepts the following C-like escapes::

     \\ \' \" \t \n

TableGen also has the following keywords::

   bit   bits      class   code         dag
   def   foreach   defm    field        in
   int   let       list    multiclass   string

TableGen also has "bang operators" which have a wide variety of meanings::

   !eq     !if      !head    !tail      !con
   !shl    !sra     !srl
   !cast   !empty   !subst   !foreach   !strconcat

Syntax
======

TableGen has an ``include`` mechanism. It does not play a role in the
syntax per se, since it is lexically replaced with the contents of the
included file.
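
A minimal sketch, assuming a hypothetical file ``Common.td`` that the tool is
able to locate::

   include "Common.td"

   // From here on, anything declared in Common.td can be used exactly as
   // if its text had been written at this point in the file.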

.. productionlist::
   IncludeDirective: "include" `TokString`

TableGen's top-level production consists of "objects".

.. productionlist::
   TableGenFile: `Object`*
   Object: `Class` | `Def` | `Defm` | `Let` | `MultiClass` | `Foreach`

``class``\es
------------

.. productionlist::
   Class: "class" `TokIdentifier` [`TemplateArgList`] `ObjectBody`

A ``class`` declaration creates a record which other records can inherit
from. A class can be parametrized by a list of "template arguments", whose
values can be used in the class body.
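
A small sketch of a parametrized class and two records derived from it.
``Reg``, ``n`` and ``size`` are illustrative names, not anything predefined
by TableGen::

   class Reg<string n, int size = 32> {
     string Name = n;
     int Size = size;
   }

   def R0 : Reg<"r0">;       // size falls back to its default value
   def D0 : Reg<"d0", 64>;   // both template arguments given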

A given class can only be defined once. A ``class`` declaration is
considered to define the class if any of the following is true:

.. break ObjectBody into its constituents so that they are present here?

#. The :token:`TemplateArgList` is present.
#. The :token:`Body` in the :token:`ObjectBody` is present and is not empty.
#. The :token:`BaseClassList` in the :token:`ObjectBody` is present.

You can declare an empty class by giving an empty :token:`TemplateArgList`
and an empty :token:`ObjectBody`. This can serve as a restricted form of
forward declaration: note that records deriving from the forward-declared
class will inherit no fields from it since the record expansion is done
when the record is parsed.
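
The following sketch illustrates the caveat, going purely by the behaviour
described above. All names are hypothetical::

   class Base;          // declaration only: no template args, no body

   def Early : Base;    // expanded right away, so it picks up no fields

   class Base {         // the single permitted definition
     int X = 1;
   }

   def Late : Base;     // parsed after the definition, so it inherits X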

.. productionlist::
   TemplateArgList: "<" `Declaration` ("," `Declaration`)* ">"

Declarations
------------

.. Omitting mention of arcane "field" prefix to discourage its use.

The declaration syntax is pretty much what you would expect as a C++
programmer.

.. productionlist::
   Declaration: `Type` `TokIdentifier` ["=" `Value`]

It assigns the value to the identifier.

Types
-----

.. productionlist::
   Type: "string" | "code" | "bit" | "int" | "dag"
       :| "bits" "<" `TokInteger` ">"
       :| "list" "<" `Type` ">"
       :| `ClassID`
   ClassID: `TokIdentifier`

Both ``string`` and ``code`` correspond to the string type; the difference
is purely to indicate programmer intention.

The :token:`ClassID` must identify a class that has been previously
declared or defined.
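
To make the type names concrete, here is a hedged sketch that declares one
field for most of the types. Every identifier in it is made up for
illustration::

   class Operand;
   def Op0 : Operand;

   def TypeSampler {
     bit Flag = 1;
     int Count = 3;
     string Name = "x";
     code Snippet = [{ return 0; }];
     bits<4> Nibble = { 1, 0, 1, 0 };
     list<int> Sizes = [8, 16, 32];
     Operand First = Op0;             // a ClassID used as a type
   }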

Values
------

.. productionlist::
   Value: `SimpleValue` `ValueSuffix`*
   ValueSuffix: "{" `RangeList` "}"
              :| "[" `RangeList` "]"
              :| "." `TokIdentifier`
   RangeList: `RangePiece` ("," `RangePiece`)*
   RangePiece: `TokInteger`
             :| `TokInteger` "-" `TokInteger`
             :| `TokInteger` `TokInteger`

The peculiar last form of :token:`RangePiece` is due to the fact that the
"``-``" is included in the :token:`TokInteger`, hence ``1-5`` gets lexed as
two consecutive :token:`TokInteger`'s, with values ``1`` and ``-5``,
instead of "1", "-", and "5".
The :token:`RangeList` can be thought of as specifying a "list slice" in
some contexts.
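
A sketch of the brace and bracket suffixes applied to fields of the same
record, with hypothetical names::

   def Slices {
     bits<8> Byte = { 1, 1, 0, 0, 1, 0, 1, 0 };
     bits<4> High = Byte{7-4};            // "7-4" lexes as 7 and -4
     bit Low = Byte{0};                   // a single-bit range
     list<int> Vals = [10, 20, 30, 40];
     list<int> Mid = Vals[1-2];           // a list slice
   }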

:token:`SimpleValue` has a number of forms:

.. productionlist::
   SimpleValue: `TokIdentifier`

The value will be the variable referenced by the identifier. It can be one
of:

.. The code for this is exceptionally abstruse. These examples are a
   best-effort attempt.

* name of a ``def``, such as the use of ``Bar`` in::

     def Bar : SomeClass {
       int X = 5;
     }

     def Foo {
       SomeClass Baz = Bar;
     }

* value local to a ``def``, such as the use of ``Bar`` in::

     def Foo {
       int Bar = 5;
       int Baz = Bar;
     }

* a template arg of a ``class``, such as the use of ``Bar`` in::

     class Foo<int Bar> {
       int Baz = Bar;
     }

* value local to a ``multiclass``, such as the use of ``Bar`` in::

     multiclass Foo {
       int Bar = 5;
       int Baz = Bar;
     }

* a template arg to a ``multiclass``, such as the use of ``Bar`` in::

     multiclass Foo<int Bar> {
       int Baz = Bar;
     }

.. productionlist::
   SimpleValue: `TokInteger`

This represents the numeric value of the integer.

.. productionlist::
   SimpleValue: `TokString`+

Multiple adjacent string literals are concatenated like in C/C++. The value
is the concatenation of the strings.
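
A brief sketch of string values, using two of the escapes the lexer accepts
and a pair of adjacent literals. The names are hypothetical::

   def Strings {
     string Quoted = "say \"hi\"\n";
     string Joined = "foo" "bar";    // same value as "foobar"
   }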

.. productionlist::
   SimpleValue: `TokCodeFragment`

The value is the string value of the code fragment.

.. productionlist::
   SimpleValue: "?"

``?`` represents an "unset" initializer.
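
One plausible use, sketched with hypothetical names, is a class that leaves a
field open for derived records to fill in::

   class Inst {
     bits<4> Opcode = ?;    // deliberately left unset here
   }

   def ADD : Inst {
     let Opcode = 1;        // the derived record supplies the value
   }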

.. productionlist::
   SimpleValue: "{" `ValueList` "}"
   ValueList: [`ValueListNE`]
   ValueListNE: `Value` ("," `Value`)*

This represents a sequence of bits, as would be used to initialize a
``bits<n>`` field (where ``n`` is the number of bits).

.. productionlist::
   SimpleValue: `ClassID` "<" `ValueListNE` ">"

This generates a new anonymous record definition (as would be created by an
unnamed ``def`` inheriting from the given class with the given template
arguments) and the value is the value of that record definition.
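
A minimal sketch of this form. ``Pair`` and ``Holder`` are illustrative
names::

   class Pair<int a, int b> {
     int First = a;
     int Second = b;
   }

   def Holder {
     // The initializer creates an anonymous record derived from Pair<1, 2>
     // and P refers to it.
     Pair P = Pair<1, 2>;
   }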

.. productionlist::
   SimpleValue: "[" `ValueList` "]" ["<" `Type` ">"]

A list initializer. The optional :token:`Type` can be used to indicate a
specific element type, otherwise the element type will be deduced from the
given values.
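
Both variants might look like this, assuming (as the text above suggests)
that the trailing type names the element type rather than the list type::

   def Lists {
     list<int> Deduced = [1, 2, 3];          // element type deduced
     list<int> Explicit = [1, 2, 3]<int>;    // element type spelled out
   }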

.. The initial `DagArg` of the dag must start with an identifier or
   !cast, but this is more of an implementation detail and so for now just
   leave it out.

.. productionlist::
   SimpleValue: "(" `DagArg` `DagArgList` ")"
   DagArgList: `DagArg` ("," `DagArg`)*
   DagArg: `Value` [":" `TokVarName`]

The initial :token:`DagArg` is called the "operator" of the dag.
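
A short sketch of a dag value, with ``ops`` and ``GPR`` defined locally as
placeholder records purely for the sake of the example::

   def ops;
   def GPR;

   def Example {
     // "ops" is the operator; each argument carries an optional $name.
     dag Operands = (ops GPR:$dst, GPR:$src);
   }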

.. productionlist::
   SimpleValue: `BangOperator` ["<" `Type` ">"] "(" `ValueListNE` ")"
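
The individual operators are not described here, so the following is only a
hedged sketch of the call syntax, using a few operators from the lexical
listing with their commonly understood meanings and hypothetical field
names::

   def Bang {
     string Joined = !strconcat("foo", "bar");
     int Shifted = !shl(1, 4);
     int Picked = !if(!eq(Shifted, 16), 1, 0);
   }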

Bodies
------

.. productionlist::
   ObjectBody: `BaseClassList` `Body`
   BaseClassList: [`BaseClassListNE`]
   BaseClassListNE: `SubClassRef` ("," `SubClassRef`)*
   SubClassRef: (`ClassID` | `MultiClassID`) ["<" `ValueList` ">"]
   DefmID: `TokIdentifier`

The version with the :token:`MultiClassID` is only valid in the
:token:`BaseClassList` of a ``defm``.
The :token:`MultiClassID` should be the name of a ``multiclass``.

.. put this somewhere else

It is after parsing the base class list that the "let stack" is applied.

.. productionlist::
   Body: ";" | "{" BodyList "}"
   BodyList: BodyItem*
   BodyItem: `Declaration` ";"
           :| "let" `TokIdentifier` ["{" `RangeList` "}"] "=" `Value` ";"

The ``let`` form allows overriding the value of an inherited field. When
the braced :token:`RangeList` is given, only the selected bits of the field
are overridden.
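
A sketch of both uses of ``let`` inside a body, with hypothetical class and
field names. The second ``let`` uses the braced bit-range syntax found in
existing ``.td`` files::

   class Base {
     int Width = 32;
     bits<8> Enc = 0;
   }

   def Narrow : Base {
     let Width = 16;         // replaces the inherited value
     let Enc{7-4} = 0xF;     // overrides only bits 7 through 4
   }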

``def``
-------

.. TODO::
   There can be pastes in the names here, like ``#NAME#``. Look into that
   and document it (it boils down to ParseIDValue with IDParseMode ==
   ParseNameMode). ParseObjectName calls into the general ParseValue, with
   the only difference from "arbitrary expression parsing" being IDParseMode
   == Mode.

.. productionlist::
   Def: "def" `TokIdentifier` `ObjectBody`

Defines a record whose name is given by the :token:`TokIdentifier`. The
fields of the record are inherited from the base classes and defined in the
body.

Special handling occurs if this ``def`` appears inside a ``multiclass`` or
a ``foreach``.

``defm``
--------

.. productionlist::
   Defm: "defm" `TokIdentifier` ":" `BaseClassList` ";"

Note that in the :token:`BaseClassList`, all of the ``multiclass``'s must
precede any ``class``'s that appear.

``foreach``
-----------

.. productionlist::
   Foreach: "foreach" `Declaration` "in" "{" `Object`* "}"
          :| "foreach" `Declaration` "in" `Object`

The value assigned to the variable in the declaration is iterated over and
the object or object list is reevaluated with the variable set at each
iterated value.
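
A hedged sketch of a loop that stamps out several records. It is written with
an untyped loop variable and a ``#`` paste in the record name, both of which
are assumptions about what the parser accepts rather than something the
grammar above spells out. ``Reg`` is a hypothetical class::

   class Reg<int n> {
     int Num = n;
   }

   // Intended to produce the records R0, R1, R2 and R3.
   foreach i = [0, 1, 2, 3] in
     def R#i : Reg<i>;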

Top-Level ``let``
-----------------

.. productionlist::
   Let: "let" `LetList` "in" "{" `Object`* "}"
      :| "let" `LetList` "in" `Object`
   LetList: `LetItem` ("," `LetItem`)*
   LetItem: `TokIdentifier` ["<" `RangeList` ">"] "=" `Value`

This is effectively equivalent to ``let`` inside the body of a record
except that it applies to multiple records at a time. The bindings are
applied at the end of parsing the base classes of a record.
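
For example, a single top-level ``let`` can set a field on a group of records.
The class and record names below are hypothetical::

   class Inst {
     bit isCall = 0;
   }

   let isCall = 1 in {
     def CALLa : Inst;    // isCall is 1
     def CALLb : Inst;    // isCall is 1
   }

   def NOP : Inst;        // outside the let, so isCall stays 0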

``multiclass``
--------------

.. productionlist::
   MultiClass: "multiclass" `TokIdentifier` [`TemplateArgList`]
             : [":" `BaseMultiClassList`] "{" `MultiClassObject`+ "}"
   BaseMultiClassList: `MultiClassID` ("," `MultiClassID`)*
   MultiClassID: `TokIdentifier`
   MultiClassObject: `Def` | `Defm` | `Let` | `Foreach`
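
The semantics of ``multiclass`` are not spelled out here, so the following is
only a sketch of the usual pattern: a ``multiclass`` groups several ``def``\s,
and a ``defm`` instantiates all of them at once, prefixing each inner name
with the name given to the ``defm``. All identifiers are hypothetical::

   class Inst<string asm> {
     string Asm = asm;
   }

   multiclass RegImm<string base> {
     def rr : Inst<!strconcat(base, " reg, reg")>;
     def ri : Inst<!strconcat(base, " reg, imm")>;
   }

   // Defines the records ADDrr and ADDri.
   defm ADD : RegImm<"add">;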