===========================
TableGen Language Reference
===========================

.. sectionauthor:: Sean Silva <silvas@purdue.edu>

.. contents::
   :local:

.. warning::
   This document is extremely rough. If you find something lacking, please
   fix it, file a documentation bug, or ask about it on llvmdev.

Introduction
============

This document is meant to be a normative spec about the TableGen language
in and of itself (i.e. how to understand a given construct in terms of how
it affects the final set of records represented by the TableGen file). If
you are unsure if this document is really what you are looking for, please
read :doc:`/TableGenFundamentals` first.

Notation
========

The lexical and syntax notation used here is intended to imitate
`Python's`_. In particular, for lexical definitions, the productions
operate at the character level and there is no implied whitespace between
elements. The syntax definitions operate at the token level, so there is
implied whitespace between tokens.

.. _`Python's`: http://docs.python.org/py3k/reference/introduction.html#notation

Lexical Analysis
================

TableGen supports BCPL (``// ...``) and nestable C-style (``/* ... */``)
comments.

The following is a listing of the basic punctuation tokens::

   - + [ ] { } ( ) < > : ; . = ? #

Numeric literals take one of the following forms:

.. TableGen actually will lex some pretty strange sequences and interpret
   them as numbers. What is shown here is an attempt to approximate what it
   "should" accept.

.. productionlist::
   TokInteger: `DecimalInteger` | `HexInteger` | `BinInteger`
   DecimalInteger: ["+" | "-"] ("0"..."9")+
   HexInteger: "0x" ("0"..."9" | "a"..."f" | "A"..."F")+
   BinInteger: "0b" ("0" | "1")+

One aspect to note is that the :token:`DecimalInteger` token *includes* the
``+`` or ``-``, as opposed to having ``+`` and ``-`` be unary operators as
most languages do.
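
As a rough illustration of these forms, the following sketch (the record and
field names are made up for this example) initializes ``int`` fields with
each kind of literal::

   def IntegerLiteralExamples {
     int Dec = 42;        // DecimalInteger
     int Neg = -7;        // the "-" is part of the token itself
     int Hex = 0x1F;      // HexInteger
     int Bin = 0b1010;    // BinInteger
   }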

TableGen has identifier-like tokens:

.. productionlist::
   ualpha: "a"..."z" | "A"..."Z" | "_"
   TokIdentifier: ("0"..."9")* `ualpha` (`ualpha` | "0"..."9")*
   TokVarName: "$" `ualpha` (`ualpha` | "0"..."9")*

Note that unlike most languages, TableGen allows :token:`TokIdentifier` to
begin with a number. In case of ambiguity, a token will be interpreted as a
numeric literal rather than an identifier.

TableGen also has two string-like literals:

.. productionlist::
   TokString: '"' <non-'"' characters and C-like escapes> '"'
   TokCodeFragment: "[{" <shortest text not containing "}]"> "}]"

.. note::
   The current implementation accepts the following C-like escapes::

      \\ \' \" \t \n
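
A minimal sketch of both literal forms (``StringLiteralExamples`` and its
fields are hypothetical names)::

   def StringLiteralExamples {
     string Message = "tab:\t quote:\" newline:\n";
     code Snippet = [{ if (x[0]) { return true; } }];
   }

The code fragment ends at the first ``}]``, so a lone ``}`` or ``]`` inside
it does not terminate it.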

TableGen also has the following keywords::

   bit   bits      class   code         dag
   def   foreach   defm    field        in
   int   let       list    multiclass   string

TableGen also has "bang operators" which have a
wide variety of meanings:

.. productionlist::
   BangOperator: one of
               :!eq     !if      !head    !tail      !con
               :!add    !shl     !sra     !srl
               :!cast   !empty   !subst   !foreach   !strconcat

Syntax
======

TableGen has an ``include`` mechanism. It does not play a role in the
syntax per se, since it is lexically replaced with the contents of the
included file.

.. productionlist::
   IncludeDirective: "include" `TokString`

TableGen's top-level production consists of "objects".

.. productionlist::
   TableGenFile: `Object`*
   Object: `Class` | `Def` | `Defm` | `Let` | `MultiClass` | `Foreach`

``class``\es
------------

.. productionlist::
   Class: "class" `TokIdentifier` [`TemplateArgList`] `ObjectBody`

A ``class`` declaration creates a record which other records can inherit
from. A class can be parametrized by a list of "template arguments", whose
values can be used in the class body.
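
For instance, a class with two template arguments and two records derived
from it might look like the following sketch (all names are invented for
illustration)::

   class Register<string n, int enc = 0> {
     string Name = n;
     int Encoding = enc;
   }

   def R0 : Register<"r0">;      // Encoding takes the default, 0
   def R1 : Register<"r1", 1>;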

A given class can only be defined once. A ``class`` declaration is
considered to define the class if any of the following is true:

.. break ObjectBody into its constituents so that they are present here?

#. The :token:`TemplateArgList` is present.
#. The :token:`Body` in the :token:`ObjectBody` is present and is not empty.
#. The :token:`BaseClassList` in the :token:`ObjectBody` is present.

You can declare an empty class by giving an empty :token:`TemplateArgList`
and an empty :token:`ObjectBody`. This can serve as a restricted form of
forward declaration: note that records deriving from the forward-declared
class will inherit no fields from it since the record expansion is done
when the record is parsed.

.. productionlist::
   TemplateArgList: "<" `Declaration` ("," `Declaration`)* ">"

Declarations
------------

.. Omitting mention of arcane "field" prefix to discourage its use.

The declaration syntax is pretty much what you would expect as a C++
programmer.

.. productionlist::
   Declaration: `Type` `TokIdentifier` ["=" `Value`]

It declares the identifier with the given type and, if a value is present,
assigns that value to the identifier.

Types
-----

.. productionlist::
   Type: "string" | "code" | "bit" | "int" | "dag"
       :| "bits" "<" `TokInteger` ">"
       :| "list" "<" `Type` ">"
       :| `ClassID`
   ClassID: `TokIdentifier`

Both ``string`` and ``code`` correspond to the string type; the difference
is purely to indicate programmer intention.

The :token:`ClassID` must identify a class that has been previously
declared or defined.
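
The sketch below declares one field of each kind; ``Widget``, ``Gadget`` and
the field names are hypothetical::

   class Widget {
     int Id = 0;
   }

   def Gadget : Widget;

   def TypeExamples {
     bit Flag = 1;
     bits<4> Nibble = { 1, 0, 1, 0 };
     int Count = 3;
     string Name = "demo";
     code Body = [{ return 0; }];
     list<int> Sizes = [8, 16, 32];
     dag Tree = (Gadget 1, 2);
     Widget Ref = Gadget;      // refers to a def derived from Widget
   }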

Values
------

.. productionlist::
   Value: `SimpleValue` `ValueSuffix`*
   ValueSuffix: "{" `RangeList` "}"
              :| "[" `RangeList` "]"
              :| "." `TokIdentifier`
   RangeList: `RangePiece` ("," `RangePiece`)*
   RangePiece: `TokInteger`
             :| `TokInteger` "-" `TokInteger`
             :| `TokInteger` `TokInteger`

The peculiar last form of :token:`RangePiece` is due to the fact that the
"``-``" is included in the :token:`TokInteger`, hence ``1-5`` gets lexed as
two consecutive :token:`TokInteger`'s, with values ``1`` and ``-5``,
instead of "1", "-", and "5".
The :token:`RangeList` can be thought of as specifying a "list slice" in some
contexts.
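
To illustrate the three suffix forms, here is a sketch using made-up records
and fields::

   def Source {
     int Value = 42;
   }

   def SuffixExamples {
     bits<8> Byte = 0b10110010;
     bits<4> High = Byte{7-4};           // bit range
     list<int> Items = [10, 20, 30, 40];
     list<int> Middle = Items[1-2];      // list slice
     int Answer = Source.Value;          // field access
   }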

:token:`SimpleValue` has a number of forms:

.. productionlist::
   SimpleValue: `TokIdentifier`

The value will be the variable referenced by the identifier. It can be one
of:

.. The code for this is exceptionally abstruse. These examples are a
   best-effort attempt.

* name of a ``def``, such as the use of ``Bar`` in::

     def Bar : SomeClass {
       int X = 5;
     }

     def Foo {
       SomeClass Baz = Bar;
     }

* value local to a ``def``, such as the use of ``Bar`` in::

     def Foo {
       int Bar = 5;
       int Baz = Bar;
     }

* a template arg of a ``class``, such as the use of ``Bar`` in::

     class Foo<int Bar> {
       int Baz = Bar;
     }

* value local to a ``multiclass``, such as the use of ``Bar`` in::

     multiclass Foo {
       int Bar = 5;
       int Baz = Bar;
     }

* a template arg to a ``multiclass``, such as the use of ``Bar`` in::

     multiclass Foo<int Bar> {
       int Baz = Bar;
     }

.. productionlist::
   SimpleValue: `TokInteger`

This represents the numeric value of the integer.

.. productionlist::
   SimpleValue: `TokString`+

Multiple adjacent string literals are concatenated like in C/C++. The value
is the concatenation of the strings.

.. productionlist::
   SimpleValue: `TokCodeFragment`

The value is the string value of the code fragment.

.. productionlist::
   SimpleValue: "?"

``?`` represents an "unset" initializer.

.. productionlist::
   SimpleValue: "{" `ValueList` "}"
   ValueList: [`ValueListNE`]
   ValueListNE: `Value` ("," `Value`)*

This represents a sequence of bits, as would be used to initialize a
``bits<n>`` field (where ``n`` is the number of bits).

.. productionlist::
   SimpleValue: `ClassID` "<" `ValueListNE` ">"

This generates a new anonymous record definition (as would be created by an
unnamed ``def`` inheriting from the given class with the given template
arguments) and the value is the value of that record definition.
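
As a sketch (``Pair`` and ``Holder`` are hypothetical names), the
initializer of ``P`` below creates an anonymous record derived from
``Pair<1, 2>`` and uses that record as the value::

   class Pair<int a, int b> {
     int First = a;
     int Second = b;
   }

   def Holder {
     Pair P = Pair<1, 2>;
   }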

.. productionlist::
   SimpleValue: "[" `ValueList` "]" ["<" `Type` ">"]

This represents a list initializer. The optional :token:`Type` can be used
to indicate a specific element type; otherwise the element type will be
deduced from the given values.

.. The initial `DagArg` of the dag must start with an identifier or
   !cast, but this is more of an implementation detail and so for now just
   leave it out.

.. productionlist::
   SimpleValue: "(" `DagArg` `DagArgList` ")"
   DagArgList: `DagArg` ("," `DagArg`)*
   DagArg: `Value` [":" `TokVarName`]

The initial :token:`DagArg` is called the "operator" of the dag.
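
For example, in the following sketch ``add`` is the operator and the two
operands are tagged with the names ``$lhs`` and ``$rhs`` (the records
``add`` and ``GPR`` are placeholders defined only for this example)::

   def add;
   def GPR;

   def DagExample {
     dag Pattern = (add GPR:$lhs, GPR:$rhs);
   }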

.. productionlist::
   SimpleValue: `BangOperator` ["<" `Type` ">"] "(" `ValueListNE` ")"
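
A few of the operators in use, as a non-authoritative sketch (the class,
record, and field names are invented; the comments show the values expected
for ``N0``)::

   class Named<string base, int n> {
     string Name = !strconcat(base, "_reg");            // "r_reg"
     int Next = !add(n, 1);                             // 1
     string Kind = !if(!eq(n, 0), "zero", "nonzero");   // "zero"
   }

   def N0 : Named<"r", 0>;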

Bodies
------

.. productionlist::
   ObjectBody: `BaseClassList` `Body`
   BaseClassList: [`BaseClassListNE`]
   BaseClassListNE: `SubClassRef` ("," `SubClassRef`)*
   SubClassRef: (`ClassID` | `MultiClassID`) ["<" `ValueList` ">"]
   DefmID: `TokIdentifier`

The version with the :token:`MultiClassID` is only valid in the
:token:`BaseClassList` of a ``defm``.
The :token:`MultiClassID` should be the name of a ``multiclass``.

.. put this somewhere else

It is after parsing the base class list that the "let stack" is applied.

.. productionlist::
   Body: ";" | "{" `BodyList` "}"
   BodyList: `BodyItem`*
   BodyItem: `Declaration` ";"
           :| "let" `TokIdentifier` [`RangeList`] "=" `Value` ";"

The ``let`` form allows overriding the value of an inherited field.
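
A small sketch of this form, using hypothetical names::

   class Base {
     int Size = 4;
     string Kind = "generic";
   }

   def Wide : Base {
     let Size = 8;        // overrides the inherited value 4
     int Extra = 1;       // an ordinary new field
   }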

``def``
-------

.. TODO::
   There can be pastes in the names here, like ``#NAME#``. Look into that
   and document it (it boils down to ParseIDValue with IDParseMode ==
   ParseNameMode). ParseObjectName calls into the general ParseValue, with
   the only difference from "arbitrary expression parsing" being IDParseMode
   == Mode.

.. productionlist::
   Def: "def" `TokIdentifier` `ObjectBody`

Defines a record whose name is given by the :token:`TokIdentifier`. The
fields of the record are inherited from the base classes and defined in the
body.

Special handling occurs if this ``def`` appears inside a ``multiclass`` or
a ``foreach``.

``defm``
--------

.. productionlist::
   Defm: "defm" `TokIdentifier` ":" `BaseClassList` ";"

Note that in the :token:`BaseClassList`, all of the ``multiclass``'s must
precede any ``class``'s that appear.

``foreach``
-----------

.. productionlist::
   Foreach: "foreach" `Declaration` "in" "{" `Object`* "}"
          :| "foreach" `Declaration` "in" `Object`

The value assigned to the variable in the declaration is iterated over and
the object or object list is reevaluated with the variable set at each
iterated value.

Top-Level ``let``
-----------------

.. productionlist::
   Let: "let" `LetList` "in" "{" `Object`* "}"
      :| "let" `LetList` "in" `Object`
   LetList: `LetItem` ("," `LetItem`)*
   LetItem: `TokIdentifier` [`RangeList`] "=" `Value`

This is effectively equivalent to ``let`` inside the body of a record
except that it applies to multiple records at a time. The bindings are
applied at the end of parsing the base classes of a record.
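
For instance (a sketch with made-up names), the ``let`` below sets
``isCommutable`` in both ``ADD`` and ``MUL`` but leaves ``SUB`` alone::

   class Inst {
     bit isCommutable = 0;
   }

   let isCommutable = 1 in {
     def ADD : Inst;
     def MUL : Inst;
   }

   def SUB : Inst;      // keeps isCommutable = 0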

``multiclass``
--------------

.. productionlist::
   MultiClass: "multiclass" `TokIdentifier` [`TemplateArgList`]
             : [":" `BaseMultiClassList`] "{" `MultiClassObject`+ "}"
   BaseMultiClassList: `MultiClassID` ("," `MultiClassID`)*
   MultiClassID: `TokIdentifier`
   MultiClassObject: `Def` | `Defm` | `Let` | `Foreach`
Sean Silva9302dcc2013-01-09 02:11:55 +0000383 MultiClassObject: `Def` | `Defm` | `Let` | `Foreach`