=====================
TableGen Fundamentals
=====================

.. contents::
   :local:

Introduction
============

TableGen's purpose is to help a human develop and maintain records of
domain-specific information. Because there may be a large number of these
records, it is specifically designed to allow writing flexible descriptions and
for common features of these records to be factored out. This reduces the
amount of duplication in the description, reduces the chance of error, and makes
it easier to structure domain-specific information.

The core part of TableGen `parses a file`_, instantiates the declarations, and
hands the result off to a domain-specific `TableGen backend`_ for processing.
The current major user of TableGen is the `LLVM code
generator <CodeGenerator.html>`_.

If you work on TableGen much and use emacs or vim, you can find an emacs
"TableGen mode" and a vim language file in the ``llvm/utils/emacs`` and
``llvm/utils/vim`` directories of your LLVM distribution, respectively.

.. _intro:

Basic concepts
--------------

TableGen files consist of two key parts: 'classes' and 'definitions', both of
which are considered 'records'.

**TableGen records** have a unique name, a list of values, and a list of
superclasses. The list of values is the main data that TableGen builds for each
record; it is this that holds the domain-specific information for the
application. The interpretation of this data is left to a specific `TableGen
backend`_, but the structure and format rules are taken care of and are fixed by
TableGen.

**TableGen definitions** are the concrete form of 'records'. These generally do
not have any undefined values, and are marked with the '``def``' keyword.

**TableGen classes** are abstract records that are used to build and describe
other records. These 'classes' allow the end-user to build abstractions for
either the domain they are targeting (such as "Register", "RegisterClass", and
"Instruction" in the LLVM code generator) or for the implementor to help factor
out common properties of records (such as "FPInst", which is used to represent
floating point instructions in the X86 backend). TableGen keeps track of all of
the classes that are used to build up a definition, so the backend can find all
definitions of a particular class, such as "Instruction".

**TableGen multiclasses** are groups of abstract records that are instantiated
all at once. Each instantiation can result in multiple TableGen definitions.
If a multiclass inherits from another multiclass, the definitions in the
sub-multiclass become part of the current multiclass, as if they were declared
in the current multiclass.

.. _described above:

An example record
-----------------

With no other arguments, TableGen parses the specified file and prints out all
of the classes, then all of the definitions. This is a good way to see what the
various definitions expand to fully. Running this on the ``X86.td`` file prints
this (at the time of this writing):

.. code-block:: llvm

  ...
  def ADD32rr {   // Instruction X86Inst I
    string Namespace = "X86";
    dag OutOperandList = (outs GR32:$dst);
    dag InOperandList = (ins GR32:$src1, GR32:$src2);
    string AsmString = "add{l}\t{$src2, $dst|$dst, $src2}";
    list<dag> Pattern = [(set GR32:$dst, (add GR32:$src1, GR32:$src2))];
    list<Register> Uses = [];
    list<Register> Defs = [EFLAGS];
    list<Predicate> Predicates = [];
    int CodeSize = 3;
    int AddedComplexity = 0;
    bit isReturn = 0;
    bit isBranch = 0;
    bit isIndirectBranch = 0;
    bit isBarrier = 0;
    bit isCall = 0;
    bit canFoldAsLoad = 0;
    bit mayLoad = 0;
    bit mayStore = 0;
    bit isImplicitDef = 0;
    bit isConvertibleToThreeAddress = 1;
    bit isCommutable = 1;
    bit isTerminator = 0;
    bit isReMaterializable = 0;
    bit isPredicable = 0;
    bit hasDelaySlot = 0;
    bit usesCustomInserter = 0;
    bit hasCtrlDep = 0;
    bit isNotDuplicable = 0;
    bit hasSideEffects = 0;
    bit neverHasSideEffects = 0;
    InstrItinClass Itinerary = NoItinerary;
    string Constraints = "";
    string DisableEncoding = "";
    bits<8> Opcode = { 0, 0, 0, 0, 0, 0, 0, 1 };
    Format Form = MRMDestReg;
    bits<6> FormBits = { 0, 0, 0, 0, 1, 1 };
    ImmType ImmT = NoImm;
    bits<3> ImmTypeBits = { 0, 0, 0 };
    bit hasOpSizePrefix = 0;
    bit hasAdSizePrefix = 0;
    bits<4> Prefix = { 0, 0, 0, 0 };
    bit hasREX_WPrefix = 0;
    FPFormat FPForm = ?;
    bits<3> FPFormBits = { 0, 0, 0 };
  }
  ...

This definition corresponds to the 32-bit register-register ``add`` instruction
of the x86 architecture. ``def ADD32rr`` defines a record named
``ADD32rr``, and the comment at the end of the line indicates the superclasses
of the definition. The body of the record contains all of the data that
TableGen assembled for the record, indicating that the instruction is part of
the "X86" namespace, the pattern indicating how the instruction should be
emitted into the assembly file, that it is a two-address instruction, has a
particular encoding, etc. The contents and semantics of the information in the
record are specific to the needs of the X86 backend, and are only shown as an
example.

As you can see, a lot of information is needed for every instruction supported
by the code generator, and specifying it all manually would be unmaintainable,
prone to bugs, and tiring to do in the first place. Because we are using
TableGen, all of the information was derived from the following definition:

.. code-block:: llvm

  let Defs = [EFLAGS],
      isCommutable = 1,                  // X = ADD Y,Z --> X = ADD Z,Y
      isConvertibleToThreeAddress = 1 in // Can transform into LEA.
  def ADD32rr  : I<0x01, MRMDestReg, (outs GR32:$dst),
                                     (ins GR32:$src1, GR32:$src2),
                   "add{l}\t{$src2, $dst|$dst, $src2}",
                   [(set GR32:$dst, (add GR32:$src1, GR32:$src2))]>;

This definition makes use of the custom class ``I`` (extended from the custom
class ``X86Inst``), which is defined in the X86-specific TableGen file, to
factor out the common features that instructions of its class share. A key
feature of TableGen is that it allows the end-user to define the abstractions
they prefer to use when describing their information.

Each ``def`` record has a special entry called "NAME". This is the name of the
record ("``ADD32rr``" above). In the general case ``def`` names can be formed
from various kinds of string processing expressions and ``NAME`` resolves to the
final value obtained after resolving all of those expressions. The user may
refer to ``NAME`` anywhere she desires to use the ultimate name of the ``def``.
``NAME`` should not be defined anywhere else in user code to avoid conflicts.

Running TableGen
----------------

TableGen runs just like any other LLVM tool. The first (optional) argument
specifies the file to read. If a filename is not specified, ``llvm-tblgen``
reads from standard input.

To be useful, one of the `TableGen backends`_ must be used. These backends are
selectable on the command line (type '``llvm-tblgen -help``' for a list). For
example, to get a list of all of the definitions that subclass a particular type
(which can be useful for building up an enum list of these records), use the
``-print-enums`` option:

.. code-block:: bash

  $ llvm-tblgen X86.td -print-enums -class=Register
  AH, AL, AX, BH, BL, BP, BPL, BX, CH, CL, CX, DH, DI, DIL, DL, DX, EAX, EBP, EBX,
  ECX, EDI, EDX, EFLAGS, EIP, ESI, ESP, FP0, FP1, FP2, FP3, FP4, FP5, FP6, IP,
  MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7, R10, R10B, R10D, R10W, R11, R11B, R11D,
  R11W, R12, R12B, R12D, R12W, R13, R13B, R13D, R13W, R14, R14B, R14D, R14W, R15,
  R15B, R15D, R15W, R8, R8B, R8D, R8W, R9, R9B, R9D, R9W, RAX, RBP, RBX, RCX, RDI,
  RDX, RIP, RSI, RSP, SI, SIL, SP, SPL, ST0, ST1, ST2, ST3, ST4, ST5, ST6, ST7,
  XMM0, XMM1, XMM10, XMM11, XMM12, XMM13, XMM14, XMM15, XMM2, XMM3, XMM4, XMM5,
  XMM6, XMM7, XMM8, XMM9,

  $ llvm-tblgen X86.td -print-enums -class=Instruction
  ABS_F, ABS_Fp32, ABS_Fp64, ABS_Fp80, ADC32mi, ADC32mi8, ADC32mr, ADC32ri,
  ADC32ri8, ADC32rm, ADC32rr, ADC64mi32, ADC64mi8, ADC64mr, ADC64ri32, ADC64ri8,
  ADC64rm, ADC64rr, ADD16mi, ADD16mi8, ADD16mr, ADD16ri, ADD16ri8, ADD16rm,
  ADD16rr, ADD32mi, ADD32mi8, ADD32mr, ADD32ri, ADD32ri8, ADD32rm, ADD32rr,
  ADD64mi32, ADD64mi8, ADD64mr, ADD64ri32, ...

The default backend prints out all of the records, as `described above`_.

If you plan to use TableGen, you will most likely have to `write a backend`_
that extracts the information specific to what you need and formats it in the
appropriate way.

.. _parses a file:

TableGen syntax
===============

TableGen doesn't care about the meaning of data (that is up to the backend to
define), but it does care about syntax, and it enforces a simple type system.
This section describes the syntax and the constructs allowed in a TableGen file.

TableGen primitives
-------------------

TableGen comments
^^^^^^^^^^^^^^^^^

TableGen supports BCPL style "``//``" comments, which run to the end of the
line, and it also supports **nestable** "``/* */``" comments.
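
As a small illustration of both comment styles, here is a sketch in which the
record name ``CommentDemo`` and its fields are made up for the example:

.. code-block:: llvm

  // A BCPL-style comment runs to the end of the line.
  def CommentDemo {
    int Width = 8;   // trailing comments work the same way
    /* A block comment can span several lines
       /* and, unlike in C, may contain another block comment */
       without ending early. */
    int Height = 4;
  }
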

.. _TableGen type:

The TableGen type system
^^^^^^^^^^^^^^^^^^^^^^^^

TableGen files are strongly typed, in a simple (but complete) type-system.
These types are used to perform automatic conversions, check for errors, and to
help interface designers constrain the input that they allow. Every `value
definition`_ is required to have an associated type.

TableGen supports a mixture of very low-level types (such as ``bit``) and very
high-level types (such as ``dag``). This flexibility is what allows it to
describe a wide range of information conveniently and compactly. The TableGen
types are:

``bit``
    A 'bit' is a boolean value that can hold either 0 or 1.

``int``
    The 'int' type represents a simple 64-bit integer value, such as 5.

``string``
    The 'string' type represents an ordered sequence of characters of arbitrary
    length.

``bits<n>``
    A 'bits' type is an arbitrary, but fixed, size integer that is broken up
    into individual bits. This type is useful because it can handle some bits
    being defined while others are undefined.

``list<ty>``
    This type represents a list whose elements are some other type. The
    contained type is arbitrary: it can even be another list type.

Class type
    Specifying a class name in a type context means that the defined value must
    be a subclass of the specified class. This is useful in conjunction with
    the ``list`` type, for example, to constrain the elements of the list to a
    common base class (e.g., a ``list<Register>`` can only contain definitions
    derived from the "``Register``" class).

``dag``
    This type represents a nestable directed graph of elements.

``code``
    This represents a big hunk of text. This is lexically distinct from string
    values because it doesn't require escaping double quotes and other common
    characters that occur in code.

To date, these types have been sufficient for describing things that TableGen
has been used for, but it is straightforward to extend this list if needed.
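
The sketch below declares one value of each type listed above. All names
(``Register``, ``R0``, ``add``, ``TypeDemo`` and its fields) are placeholders
chosen for this example and are not taken from any real target description:

.. code-block:: llvm

  class Register;             // hypothetical class, used only as a type
  def R0 : Register;
  def add;                    // hypothetical operator for the dag below

  def TypeDemo {
    bit            Flag     = 1;
    int            Count    = 5;
    string         Label    = "demo";
    bits<4>        Encoding = { 1, 0, 1, 0 };
    list<int>      Sizes    = [8, 16, 32];
    Register       Base     = R0;                // class type
    list<Register> Regs     = [R0];              // list constrained to a class
    dag            Tree     = (add R0, (add R0, R0));
    code           Body     = [{ return "no escaping needed"; }];
  }
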

.. _TableGen expressions:

TableGen values and expressions
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

TableGen allows for a number of different expression forms when building up
values. These forms allow the TableGen file to be written in a natural syntax
and flavor for the application. The current expression forms supported include:

``?``
    uninitialized field

``0b1001011``
    binary integer value

``07654321``
    octal integer value (indicated by a leading 0)

``7``
    decimal integer value

``0x7F``
    hexadecimal integer value

``"foo"``
    string value

``[{ ... }]``
    code fragment

``[ X, Y, Z ]<type>``
    list value. <type> is the type of the list element and is usually optional.
    In rare cases, TableGen is unable to deduce the element type, in which case
    the user must specify it explicitly.

``{ a, b, c }``
    initializer for a "bits<3>" value

``value``
    value reference

``value{17}``
    access to one bit of a value

``value{15-17}``
    access to multiple bits of a value

``DEF``
    reference to a record definition

``CLASS<val list>``
    reference to a new anonymous definition of CLASS with the specified template
    arguments.

``X.Y``
    reference to the subfield of a value

``list[4-7,17,2-3]``
    A slice of the 'list' list, including elements 4,5,6,7,17,2, and 3 from it.
    Elements may be included multiple times.

``foreach <var> = [ <list> ] in { <body> }``

``foreach <var> = [ <list> ] in <def>``
    Replicate <body> or <def>, replacing instances of <var> with each value
    in <list>. <var> is scoped at the level of the ``foreach`` loop and must
    not conflict with any other object introduced in <body> or <def>. Currently
    only ``def``\s are expanded within <body>.

``foreach <var> = 0-15 in ...``

``foreach <var> = {0-15,32-47} in ...``
    Loop over ranges of integers. The braces are required for multiple ranges.

``(DEF a, b)``
    a dag value. The first element is required to be a record definition, the
    remaining elements in the list may be arbitrary other values, including
    nested ``dag`` values.

``!strconcat(a, b)``
    A string value that is the result of concatenating the 'a' and 'b' strings.

``str1#str2``
    "#" (paste) is a shorthand for !strconcat. It may concatenate things that
    are not quoted strings, in which case an implicit !cast<string> is done on
    the operand of the paste.

``!cast<type>(a)``
    A symbol of type *type* obtained by looking up the string 'a' in the symbol
    table. If the type of 'a' does not match *type*, TableGen aborts with an
    error. !cast<string> is a special case in that the argument must be an
    object defined by a 'def' construct.

``!subst(a, b, c)``
    If 'a' and 'b' are of string type or are symbol references, substitute 'b'
    for 'a' in 'c'. This operation is analogous to $(subst) in GNU make.

``!foreach(a, b, c)``
    For each member 'b' of dag or list 'a', apply operator 'c'. 'b' is a dummy
    variable that should be declared as a member variable of an instantiated
    class. This operation is analogous to $(foreach) in GNU make.

``!head(a)``
    The first element of list 'a'.

``!tail(a)``
    The 2nd-N elements of list 'a'.

``!empty(a)``
    An integer {0,1} indicating whether list 'a' is empty.

``!if(a,b,c)``
    'b' if the result of 'int' or 'bit' operator 'a' is nonzero, 'c' otherwise.

``!eq(a,b)``
    'bit 1' if string a is equal to string b, 0 otherwise. This only operates
    on string, int and bit objects. Use !cast<string> to compare other types of
    objects.

Note that all of the values have rules specifying how they convert to values
for different types. These rules allow you to assign a value like "``7``"
to a "``bits<4>``" value, for example.
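
To make a few of these forms concrete, the following sketch combines bit
access, ``!strconcat``, ``!if``/``!eq``, ``!head``/``!tail`` and a list slice in
one class. The class ``Encoded``, its fields, and the definition ``AddEnc`` are
hypothetical names invented for this example; running the file through
``llvm-tblgen`` with no backend option is a convenient way to inspect what each
expression resolves to:

.. code-block:: llvm

  class Encoded<bits<8> op, string mnemonic> {
    bits<8>   Opcode  = op;
    bit       TopBit  = op{7};                         // one bit of a value
    string    AsmName = !strconcat(mnemonic, "_enc");  // string concatenation
    string    Kind    = !if(!eq(mnemonic, "add"), "arith", "other");
    list<int> Widths  = [8, 16, 32, 64];
    int       First   = !head(Widths);                 // first list element
    list<int> Rest    = !tail(Widths);                 // everything after it
    list<int> Middle  = Widths[1-2];                   // slice: elements 1 and 2
  }

  def AddEnc : Encoded<0x81, "add">;
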

Classes and definitions
-----------------------

As mentioned in the `intro`_, classes and definitions (collectively known as
'records') in TableGen are the main high-level unit of information that TableGen
collects. Records are defined with a ``def`` or ``class`` keyword, the record
name, and an optional list of "`template arguments`_". If the record has
superclasses, they are specified as a comma separated list that starts with a
colon character ("``:``"). If `value definitions`_ or `let expressions`_ are
needed for the class, they are enclosed in curly braces ("``{}``"); otherwise,
the record ends with a semicolon.

Here is a simple TableGen file:

.. code-block:: llvm

  class C { bit V = 1; }
  def X : C;
  def Y : C {
    string Greeting = "hello";
  }

This example defines two definitions, ``X`` and ``Y``, both of which derive from
the ``C`` class. Because of this, they both get the ``V`` bit value. The ``Y``
definition also gets the ``Greeting`` member.

In general, classes are useful for collecting together the commonality between a
group of records and isolating it in a single place. Also, classes permit the
specification of default values for their subclasses, allowing the subclasses to
override them as they wish.

.. _value definition:
.. _value definitions:

Value definitions
^^^^^^^^^^^^^^^^^

Value definitions define named entries in records. A value must be defined
before it can be referred to as the operand for another value definition or
before the value is reset with a `let expression`_. A value is defined by
specifying a `TableGen type`_ and a name. If an initial value is available, it
may be specified after the type with an equal sign. Value definitions require
terminating semicolons.
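
A minimal sketch of these rules, using a hypothetical ``Widget`` class whose
name and fields are invented for illustration:

.. code-block:: llvm

  class Widget {
    int    Width;             // type and name only; the value stays undefined
    int    Height = 10;       // initial value given after the equal sign
    int    Depth  = Height;   // refers to a value that is already defined
    string Label  = "widget";
  }

  def W : Widget;

In the default record dump, a value left without an initializer, such as
``Width`` here, is shown as ``?``, the same way ``FPForm`` appears in the
``ADD32rr`` record earlier in this document.
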

.. _let expression:
.. _let expressions:
.. _"let" expressions within a record:

'let' expressions
^^^^^^^^^^^^^^^^^

A record-level let expression is used to change the value of a value definition
in a record. This is primarily useful when a superclass defines a value that a
derived class or definition wants to override. Let expressions consist of the
'``let``' keyword followed by a value name, an equal sign ("``=``"), and a new
value. For example, a new class could be added to the example above, redefining
the ``V`` field for all of its subclasses:

.. code-block:: llvm

  class D : C { let V = 0; }
  def Z : D;

In this case, the ``Z`` definition will have a zero value for its ``V`` value,
despite the fact that it derives (indirectly) from the ``C`` class, because the
``D`` class overrode its value.
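
Since a definition can override a value as well as a derived class, the same
effect can be had for a single record. The fragment below is a standalone
sketch: it repeats ``C`` so that it can be processed on its own, and the
definition name ``W`` is made up for the example:

.. code-block:: llvm

  class C { bit V = 1; }

  def W : C {
    let V = 0;   // affects W only; other records derived from C keep V = 1
  }
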

.. _template arguments:

Class template arguments
^^^^^^^^^^^^^^^^^^^^^^^^

TableGen permits the definition of parameterized classes as well as normal
concrete classes. Parameterized TableGen classes specify a list of variable
bindings (which may optionally have defaults) that are bound when used. Here is
a simple example:

.. code-block:: llvm

  class FPFormat<bits<3> val> {
    bits<3> Value = val;
  }
  def NotFP      : FPFormat<0>;
  def ZeroArgFP  : FPFormat<1>;
  def OneArgFP   : FPFormat<2>;
  def OneArgFPRW : FPFormat<3>;
  def TwoArgFP   : FPFormat<4>;
  def CompareFP  : FPFormat<5>;
  def CondMovFP  : FPFormat<6>;
  def SpecialFP  : FPFormat<7>;

In this case, template arguments are used as a space efficient way to specify a
list of "enumeration values", each with a "``Value``" field set to the specified
integer.
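
Default values for template arguments are not exercised by the example above.
A short sketch of how they might be used follows; the class ``Reg``, its
arguments, and the two definitions are hypothetical names chosen for
illustration:

.. code-block:: llvm

  class Reg<string n, int size = 32> {
    string AsmName = n;
    int    Size    = size;
  }

  def R0 : Reg<"r0">;        // size is omitted, so the default of 32 applies
  def D0 : Reg<"d0", 64>;    // size is given explicitly
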

The more esoteric forms of `TableGen expressions`_ are useful in conjunction
with template arguments. As an example:

.. code-block:: llvm

  class ModRefVal<bits<2> val> {
    bits<2> Value = val;
  }

  def None   : ModRefVal<0>;
  def Mod    : ModRefVal<1>;
  def Ref    : ModRefVal<2>;
  def ModRef : ModRefVal<3>;

  class Value<ModRefVal MR> {
    // Decode some information into a more convenient format, while providing
    // a nice interface to the user of the "Value" class.
    bit isMod = MR.Value{0};
    bit isRef = MR.Value{1};

    // other stuff...
  }

  // Example uses
  def bork : Value<Mod>;
  def zork : Value<Ref>;
  def hork : Value<ModRef>;

This is obviously a contrived example, but it shows how template arguments can
be used to decouple the interface provided to the user of the class from the
actual internal data representation expected by the class. In this case,
running ``llvm-tblgen`` on the example prints the following definitions:

.. code-block:: llvm

  def bork {      // Value
    bit isMod = 1;
    bit isRef = 0;
  }
  def hork {      // Value
    bit isMod = 1;
    bit isRef = 1;
  }
  def zork {      // Value
    bit isMod = 0;
    bit isRef = 1;
  }

This shows that TableGen was able to dig into the argument and extract a piece
of information that was requested by the designer of the "Value" class. For
more realistic examples, please see existing users of TableGen, such as the X86
backend.

Multiclass definitions and instances
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

While classes with template arguments are a good way to factor commonality
between two instances of a definition, multiclasses allow a convenient notation
for defining multiple definitions at once (instances of implicitly constructed
classes). For example, consider a 3-address instruction set whose instructions
come in two forms: "``reg = reg op reg``" and "``reg = reg op imm``"
(e.g. SPARC). In this case, you'd like to specify in one place that this
commonality exists, then in a separate place indicate what all the ops are.

Here is an example TableGen fragment that shows this idea:

.. code-block:: llvm

  def ops;
  def GPR;
  def Imm;
  class inst<int opc, string asmstr, dag operandlist>;

  multiclass ri_inst<int opc, string asmstr> {
    def _rr : inst<opc, !strconcat(asmstr, " $dst, $src1, $src2"),
                   (ops GPR:$dst, GPR:$src1, GPR:$src2)>;
    def _ri : inst<opc, !strconcat(asmstr, " $dst, $src1, $src2"),
                   (ops GPR:$dst, GPR:$src1, Imm:$src2)>;
  }

  // Instantiations of the ri_inst multiclass.
  defm ADD : ri_inst<0b111, "add">;
  defm SUB : ri_inst<0b101, "sub">;
  defm MUL : ri_inst<0b100, "mul">;
  ...

The names of the resulting definitions have the fragment names from the
multiclass appended to them, so this defines ``ADD_rr``, ``ADD_ri``,
``SUB_rr``, etc. A defm may inherit from multiple multiclasses, instantiating
definitions from each multiclass. Using a multiclass this way is exactly
equivalent to instantiating the classes multiple times yourself, e.g. by
writing:

.. code-block:: llvm

  def ops;
  def GPR;
  def Imm;
  class inst<int opc, string asmstr, dag operandlist>;

  class rrinst<int opc, string asmstr>
    : inst<opc, !strconcat(asmstr, " $dst, $src1, $src2"),
           (ops GPR:$dst, GPR:$src1, GPR:$src2)>;

  class riinst<int opc, string asmstr>
    : inst<opc, !strconcat(asmstr, " $dst, $src1, $src2"),
           (ops GPR:$dst, GPR:$src1, Imm:$src2)>;

  // Instantiations of the ri_inst multiclass.
  def ADD_rr : rrinst<0b111, "add">;
  def ADD_ri : riinst<0b111, "add">;
  def SUB_rr : rrinst<0b101, "sub">;
  def SUB_ri : riinst<0b101, "sub">;
  def MUL_rr : rrinst<0b100, "mul">;
  def MUL_ri : riinst<0b100, "mul">;
  ...

A ``defm`` can also be used inside a multiclass, providing several levels of
multiclass instantiations.

.. code-block:: llvm

  class Instruction<bits<4> opc, string Name> {
    bits<4> opcode = opc;
    string name = Name;
  }

  multiclass basic_r<bits<4> opc> {
    def rr : Instruction<opc, "rr">;
    def rm : Instruction<opc, "rm">;
  }

  multiclass basic_s<bits<4> opc> {
    defm SS : basic_r<opc>;
    defm SD : basic_r<opc>;
    def X : Instruction<opc, "x">;
  }

  multiclass basic_p<bits<4> opc> {
    defm PS : basic_r<opc>;
    defm PD : basic_r<opc>;
    def Y : Instruction<opc, "y">;
  }

  defm ADD : basic_s<0xf>, basic_p<0xf>;
  ...

  // Results
  def ADDPDrm { ...
  def ADDPDrr { ...
  def ADDPSrm { ...
  def ADDPSrr { ...
  def ADDSDrm { ...
  def ADDSDrr { ...
  def ADDY { ...
  def ADDX { ...

``defm`` declarations can inherit from classes too. The rule to follow is that
the class list must start after the last multiclass, and there must be at least
one multiclass before it.

.. code-block:: llvm

  class XD { bits<4> Prefix = 11; }
  class XS { bits<4> Prefix = 12; }

  class I<bits<4> op> {
    bits<4> opcode = op;
  }

  multiclass R {
    def rr : I<4>;
    def rm : I<2>;
  }

  multiclass Y {
    defm SS : R, XD;
    defm SD : R, XS;
  }

  defm Instr : Y;

  // Results
  def InstrSDrm {
    bits<4> opcode = { 0, 0, 1, 0 };
    bits<4> Prefix = { 1, 1, 0, 0 };
  }
  ...
  def InstrSSrr {
    bits<4> opcode = { 0, 1, 0, 0 };
    bits<4> Prefix = { 1, 0, 1, 1 };
  }

File scope entities
-------------------

File inclusion
^^^^^^^^^^^^^^

TableGen supports the '``include``' token, which textually substitutes the
specified file in place of the include directive. The filename should be
specified as a double quoted string immediately after the '``include``' keyword.
Example:

.. code-block:: llvm

  include "foo.td"

'let' expressions
^^^^^^^^^^^^^^^^^

"Let" expressions at file scope are similar to `"let" expressions within a
record`_, except they can specify a value binding for multiple records at a
time, and may be useful in certain other cases. File-scope let expressions are
really just another way that TableGen allows the end-user to factor out
commonality from the records.

File-scope "let" expressions take a comma-separated list of bindings to apply,
and one or more records to bind the values in. Here are some examples:

.. code-block:: llvm

  let isTerminator = 1, isReturn = 1, isBarrier = 1, hasCtrlDep = 1 in
    def RET : I<0xC3, RawFrm, (outs), (ins), "ret", [(X86retflag 0)]>;

  let isCall = 1 in
    // All calls clobber the non-callee saved registers...
    let Defs = [EAX, ECX, EDX, FP0, FP1, FP2, FP3, FP4, FP5, FP6, ST0,
                MM0, MM1, MM2, MM3, MM4, MM5, MM6, MM7,
                XMM0, XMM1, XMM2, XMM3, XMM4, XMM5, XMM6, XMM7, EFLAGS] in {
      def CALLpcrel32 : Ii32<0xE8, RawFrm, (outs), (ins i32imm:$dst,variable_ops),
                             "call\t${dst:call}", []>;
      def CALL32r     : I<0xFF, MRM2r, (outs), (ins GR32:$dst, variable_ops),
                          "call\t{*}$dst", [(X86call GR32:$dst)]>;
      def CALL32m     : I<0xFF, MRM2m, (outs), (ins i32mem:$dst, variable_ops),
                          "call\t{*}$dst", []>;
    }

File-scope "let" expressions are often useful when a couple of definitions need
to be added to several records, and the records do not otherwise need to be
opened, as in the case with the ``CALL*`` instructions above.
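
The X86 examples above depend on classes defined elsewhere in that backend. As
a self-contained sketch of the same two shapes (one binding over a braced group,
several bindings over a single record), consider the following fragment, in
which ``Inst`` and the definition names are hypothetical:

.. code-block:: llvm

  class Inst {
    bit isCall    = 0;
    bit isBarrier = 0;
  }

  // One binding applied to every record in the braced group.
  let isCall = 1 in {
    def CALLa : Inst;
    def CALLb : Inst;
  }

  // Several bindings applied to a single record; no braces are needed.
  let isCall = 1, isBarrier = 1 in
  def TRAP : Inst;
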

It's also possible to use "let" expressions inside multiclasses, providing more
ways to factor out commonality from the records, especially when using several
levels of multiclass instantiations. This also avoids the need for "let"
expressions within subsequent records inside a multiclass.

.. code-block:: llvm

  multiclass basic_r<bits<4> opc> {
    let Predicates = [HasSSE2] in {
      def rr : Instruction<opc, "rr">;
      def rm : Instruction<opc, "rm">;
    }
    let Predicates = [HasSSE3] in
      def rx : Instruction<opc, "rx">;
  }

  multiclass basic_ss<bits<4> opc> {
    let IsDouble = 0 in
      defm SS : basic_r<opc>;

    let IsDouble = 1 in
      defm SD : basic_r<opc>;
  }

  defm ADD : basic_ss<0xf>;

Looping
^^^^^^^

TableGen supports the '``foreach``' block, which textually replicates the loop
body, substituting iterator values for iterator references in the body.
Example:

.. code-block:: llvm

  foreach i = [0, 1, 2, 3] in {
    def R#i : Register<...>;
    def F#i : Register<...>;
  }

This will create objects ``R0``, ``R1``, ``R2`` and ``R3``, along with ``F0``,
``F1``, ``F2`` and ``F3``. ``foreach`` blocks may be nested. If there is only
one item in the body the braces may be elided:

.. code-block:: llvm

  foreach i = [0, 1, 2, 3] in
    def R#i : Register<...>;
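
The fragments above elide the ``Register`` arguments. A complete variant that
can be processed on its own might look like the sketch below, which uses the
integer-range form of ``foreach`` instead of an explicit list. The ``Register``
class shown here is a stand-in invented for the example, not the one used by
the LLVM code generator, and the ``0-3`` range spelling follows this document;
other TableGen versions may prefer a different range syntax:

.. code-block:: llvm

  class Register<int n> {
    int Num = n;
  }

  // Intended to define R0 through R3, each carrying its own index in Num.
  foreach i = 0-3 in
    def R#i : Register<i>;
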

Code Generator backend info
===========================

Expressions used by the code generator to describe instructions and isel
patterns:

``(implicit a)``
    an implicitly defined physical register. This tells the dag instruction
    selection emitter that the input pattern's extra definitions match implicit
    physical register definitions.

.. _TableGen backend:
.. _TableGen backends:
.. _write a backend:

TableGen backends
=================

Until we get a step-by-step HowTo for writing TableGen backends, you can at
least grab the boilerplate (build system, new files, etc.) from Clang's
r173931.

TODO: How they work, how to write one. This section should not contain details
about any particular backend, except maybe ``-print-enums`` as an example. This
should highlight the APIs in ``TableGen/Record.h``.