========================================
Machine IR (MIR) Format Reference Manual
========================================

.. contents::
   :local:

.. warning::
   This is a work in progress.

Introduction
============

This document is a reference manual for the Machine IR (MIR) serialization
format. MIR is a human readable serialization format that is used to represent
LLVM's :ref:`machine specific intermediate representation
<machine code representation>`.

The MIR serialization format is designed to be used for testing the code
generation passes in LLVM.

Overview
========

The MIR serialization format uses a YAML container. YAML is a standard
data serialization language, and the full YAML language spec can be read at
`yaml.org
<http://www.yaml.org/spec/1.2/spec.html#Introduction>`_.

A MIR file is split up into a series of `YAML documents`_. The first document
can contain an optional embedded LLVM IR module, and the rest of the documents
contain the serialized machine functions.

.. _YAML documents: http://www.yaml.org/spec/1.2/spec.html#id2800132
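
As a rough illustration of this layout, the Python sketch below splits the text
of a MIR file into its YAML documents. It is not how llc parses MIR: it assumes
that every document starts with a line beginning with ``---`` and may end with
a ``...`` line, and the helper name ``split_mir_documents`` is made up for this
example.

.. code-block:: python

   def split_mir_documents(text):
       """Split MIR text into YAML documents (simplified, not a YAML parser)."""
       documents = []
       current = None
       for line in text.splitlines():
           if line.startswith("---"):
               # A document start marker also ends a document that is still open.
               if current is not None:
                   documents.append("\n".join(current))
               current = [line]
           elif line.rstrip() == "...":
               # An explicit document end marker.
               if current is not None:
                   current.append(line)
                   documents.append("\n".join(current))
                   current = None
           elif current is not None:
               current.append(line)
       if current is not None:
           documents.append("\n".join(current))
       return documents


   example = "\n".join([
       "--- |",
       "  define void @f() { ret void }",
       "...",
       "---",
       "name: f",
       "body: |",
       "  bb.0:",
       "...",
   ])
   docs = split_mir_documents(example)
   print(len(docs))  # 2

In this example the first document holds the embedded IR module and the second
one holds the machine function ``f``.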

MIR Testing Guide
=================

You can use the MIR format for testing in two different ways:

- You can write MIR tests that invoke a single code generation pass using the
  ``-run-pass`` option in llc.

- You can use llc's ``-stop-after`` option with existing or new LLVM assembly
  tests and check the MIR output of a specific code generation pass.

Testing Individual Code Generation Passes
-----------------------------------------

The ``-run-pass`` option in llc allows you to create MIR tests that invoke just
a single code generation pass. When this option is used, llc will parse an
input MIR file, run the specified code generation pass(es), and output the
resulting MIR code.

You can generate an input MIR file for the test by using the ``-stop-after`` or
``-stop-before`` option in llc. For example, if you would like to write a test
for the post register allocation pseudo instruction expansion pass, you can
specify the machine copy propagation pass in the ``-stop-after`` option, as it
runs just before the pass that we are trying to test:

   ``llc -stop-after=machine-cp bug-trigger.ll > test.mir``

If the same pass is run multiple times, a run index can be included
after the name with a comma.

   ``llc -stop-after=dead-mi-elimination,1 bug-trigger.ll > test.mir``

After generating the input MIR file, you'll have to add a run line that uses
the ``-run-pass`` option to it. In order to test the post register allocation
pseudo instruction expansion pass on X86-64, a run line like the one shown
below can be used:

   ``# RUN: llc -o - %s -mtriple=x86_64-- -run-pass=postrapseudos | FileCheck %s``

The MIR files are target dependent, so they have to be placed in the target
specific test directories (``test/CodeGen/TARGETNAME``). They also need to
specify a target triple or a target architecture either in the run line or in
the embedded LLVM IR module.
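
The two command lines used in this workflow can be assembled mechanically. The
Python sketch below only formats the strings shown above; the helper names
``stop_after_command`` and ``run_line`` are hypothetical, and only options that
already appear in this section are used.

.. code-block:: python

   def stop_after_command(pass_name, source, output, run_index=None):
       """Build the llc command that dumps MIR after the given pass."""
       stop = pass_name
       if run_index is not None:
           # A run index selects one run of a pass that runs multiple times.
           stop = "%s,%d" % (pass_name, run_index)
       return "llc -stop-after=%s %s > %s" % (stop, source, output)


   def run_line(triple, pass_name):
       """Build the RUN line that runs a single pass over the MIR test."""
       return ("# RUN: llc -o - %s -mtriple=" + triple +
               " -run-pass=" + pass_name + " | FileCheck %s")


   print(stop_after_command("machine-cp", "bug-trigger.ll", "test.mir"))
   print(run_line("x86_64--", "postrapseudos"))

The generated run line still has to be added to the top of the MIR file by
hand, together with the ``CHECK`` lines of the test.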

Simplifying MIR files
^^^^^^^^^^^^^^^^^^^^^

The MIR code coming out of ``-stop-after``/``-stop-before`` is very verbose;
tests are more accessible and future proof when simplified:

- Use the ``-simplify-mir`` option with llc.

- Machine function attributes often have default values or the test works just
  as well with default values. Typical candidates for this are: ``alignment:``,
  ``exposesReturnsTwice``, ``legalized``, ``regBankSelected``, ``selected``.
  The whole ``frameInfo`` section is often unnecessary if there is no special
  frame usage in the function. ``tracksRegLiveness``, on the other hand, is
  often necessary for passes that care about block livein lists.

- The (global) ``liveins:`` list is typically only interesting for early
  instruction selection passes and can be removed when testing later passes.
  The per-block ``liveins:`` lists, on the other hand, are necessary if
  ``tracksRegLiveness`` is true.

- Branch probability data in block ``successors:`` lists can be dropped if the
  test doesn't depend on it. Example:
  ``successors: %bb.1(0x40000000), %bb.2(0x40000000)`` can be replaced with
  ``successors: %bb.1, %bb.2``.

- MIR code contains a whole IR module. This is necessary because there are
  no equivalents in MIR for global variables, references to external functions,
  function attributes, metadata, and debug info. Instead, some MIR data
  references the IR constructs. You can often remove them if the test doesn't
  depend on them.

- Alias Analysis is performed on IR values. These are referenced by memory
  operands in MIR. Example: ``:: (load 8 from %ir.foobar, !alias.scope !9)``.
  If the test doesn't depend on (good) alias analysis, the references can be
  dropped: ``:: (load 8)``.

- MIR blocks can reference IR blocks for debug printing, profile information
  or debug locations. Example: ``bb.42.myblock`` in MIR references the IR block
  ``myblock``. It is usually possible to drop the ``.myblock`` reference and
  simply use ``bb.42``.

- If there are no memory operands or blocks referencing the IR then the
  IR function can be replaced by a parameterless dummy function like
  ``define void @func() { ret void }``.

- It is possible to drop the whole IR section of the MIR file if it only
  contains dummy functions (see above). The .mir loader will create the
  IR functions automatically in this case.
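
Some of these simplifications are mechanical enough to script. As one example,
the Python sketch below drops the branch probability data from ``successors:``
lists. It is a text-level approximation and not part of LLVM: the function name
``drop_branch_probabilities`` and the regular expression are assumptions made
for this illustration.

.. code-block:: python

   import re

   # A block reference followed by a hexadecimal or decimal value in brackets.
   _WEIGHTED_SUCCESSOR = re.compile(
       r"(%bb\.\d+(?:\.[^\s,()]+)?)\((?:0x[0-9a-fA-F]+|\d+)\)")


   def drop_branch_probabilities(mir_body):
       """Remove the bracketed values from every successors: list."""
       lines = []
       for line in mir_body.splitlines():
           # Only touch successor lists; instructions are left alone.
           if line.lstrip().startswith("successors:"):
               line = _WEIGHTED_SUCCESSOR.sub(r"\1", line)
           lines.append(line)
       return "\n".join(lines)


   before = "successors: %bb.1(0x40000000), %bb.2(0x40000000)"
   print(drop_branch_probabilities(before))  # successors: %bb.1, %bb.2

Only apply such a rewrite when the test does not depend on the probabilities,
as noted in the list above.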

.. _limitations:

Limitations
-----------

Currently the MIR format has several limitations in terms of which state it
can serialize:

- The target-specific state in the target-specific ``MachineFunctionInfo``
  subclasses isn't serialized at the moment.

- The target-specific ``MachineConstantPoolValue`` subclasses (in the ARM and
  SystemZ backends) aren't serialized at the moment.

- The ``MCSymbol`` machine operands don't support temporary or local symbols.

- A lot of the state in ``MachineModuleInfo`` isn't serialized; only the CFI
  instructions and the variable debug information from MMI are serialized right
  now.

These limitations impose restrictions on what you can test with the MIR format.
For now, tests that depend on the state of temporary or local ``MCSymbol``
operands, or on the exception handling state in MMI, can't use the MIR format.
Likewise, tests that depend on the state of the target specific
``MachineFunctionInfo`` or ``MachineConstantPoolValue`` subclasses can't use
the MIR format at the moment.

High Level Structure
====================

.. _embedded-module:

Embedded Module
---------------

When the first YAML document contains a `YAML block literal string`_, the MIR
parser will treat this string as an LLVM assembly language string that
represents an embedded LLVM IR module.
Here is an example of a YAML document that contains an LLVM module:

.. code-block:: llvm

   define i32 @inc(i32* %x) {
   entry:
     %0 = load i32, i32* %x
     %1 = add i32 %0, 1
     store i32 %1, i32* %x
     ret i32 %1
   }

.. _YAML block literal string: http://www.yaml.org/spec/1.2/spec.html#id2795688

Machine Functions
-----------------

The remaining YAML documents contain the machine functions. This is an example
of such a YAML document:

.. code-block:: text

   ---
   name: inc
   tracksRegLiveness: true
   liveins:
     - { reg: '$rdi' }
   body: |
     bb.0.entry:
       liveins: $rdi

       $eax = MOV32rm $rdi, 1, _, 0, _
       $eax = INC32r killed $eax, implicit-def dead $eflags
       MOV32mr killed $rdi, 1, _, 0, _, $eax
       RETQ $eax
   ...

The document above consists of attributes that represent the various
properties and data structures in a machine function.

The attribute ``name`` is required, and its value should be identical to the
name of a function that this machine function is based on.

The attribute ``body`` is a `YAML block literal string`_. Its value represents
the function's machine basic blocks and their machine instructions.
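
To make the roles of ``name`` and ``body`` concrete, the Python sketch below
pulls both attributes out of the text of one machine function document. It
relies on plain string handling instead of a YAML parser, assumes an unquoted
``name`` value and a ``body: |`` block literal, and the function name
``machine_function_name_and_body`` is hypothetical.

.. code-block:: python

   def machine_function_name_and_body(document):
       """Return the name attribute and the raw body text of a document."""
       name = None
       body = []
       in_body = False
       for line in document.splitlines():
           if in_body:
               # The block literal continues while lines are indented or empty.
               if line.startswith(" ") or not line.strip():
                   body.append(line)
                   continue
               in_body = False
           if line.startswith("name:"):
               name = line[len("name:"):].strip()
           elif line.startswith("body:"):
               in_body = True
       return name, "\n".join(body)


   document = "\n".join([
       "---",
       "name: inc",
       "tracksRegLiveness: true",
       "body: |",
       "  bb.0.entry:",
       "    liveins: $rdi",
       "",
       "    RETQ $eax",
       "...",
   ])
   name, body = machine_function_name_and_body(document)
   print(name)  # inc

The returned body text is what the machine instruction language described in
the following sections applies to.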

Machine Instructions Format Reference
=====================================

The machine basic blocks and their instructions are represented using a custom,
human readable serialization language. This language is used in the
`YAML block literal string`_ that corresponds to the machine function's body.

A source string that uses this language contains a list of machine basic
blocks, which are described in the section below.

Machine Basic Blocks
--------------------

A machine basic block is defined in a single block definition source construct
that contains the block's ID.
The example below defines two blocks with the IDs zero and one:

.. code-block:: text

   bb.0:
     <instructions>
   bb.1:
     <instructions>

A machine basic block can also have a name. It should be specified after the ID
in the block's definition:

.. code-block:: text

   bb.0.entry:       ; This block's name is "entry"
     <instructions>

The block's name should be identical to the name of the IR block that this
machine block is based on.

.. _block-references:

Block References
^^^^^^^^^^^^^^^^

The machine basic blocks are identified by their ID numbers. Individual
blocks are referenced using the following syntax:

.. code-block:: text

   %bb.<id>

Example:

.. code-block:: llvm

   %bb.0

The following syntax is also supported, but the former syntax is preferred for
block references:

.. code-block:: text

   %bb.<id>[.<name>]

Example:

.. code-block:: llvm

   %bb.1.then
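
Both forms can be recognised with one pattern. The Python sketch below splits a
block reference into its ID and optional name; the function name
``parse_block_reference`` and the regular expression are illustrative
assumptions and do not mirror the MIR lexer.

.. code-block:: python

   import re

   # "%bb.", a decimal ID, and an optional ".<name>" suffix.
   _BLOCK_REFERENCE = re.compile(r"%bb\.(\d+)(?:\.(.+))?")


   def parse_block_reference(text):
       """Return (id, name) for a block reference; name may be None."""
       match = _BLOCK_REFERENCE.fullmatch(text)
       if match is None:
           raise ValueError("not a block reference: %r" % text)
       return int(match.group(1)), match.group(2)


   print(parse_block_reference("%bb.0"))       # (0, None)
   print(parse_block_reference("%bb.1.then"))  # (1, 'then')

Because blocks are identified by their ID numbers, the name part can be ignored
when resolving a reference.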

Successors
^^^^^^^^^^

The machine basic block's successors have to be specified before any of the
instructions:

.. code-block:: text

   bb.0.entry:
     successors: %bb.1.then, %bb.2.else
     <instructions>
   bb.1.then:
     <instructions>
   bb.2.else:
     <instructions>

The branch weights can be specified in brackets after the successor blocks.
The example below defines a block that has two successors with branch weights
of 32 and 16:

.. code-block:: text

   bb.0.entry:
     successors: %bb.1.then(32), %bb.2.else(16)

.. _bb-liveins:

Live In Registers
^^^^^^^^^^^^^^^^^

The machine basic block's live in registers have to be specified before any of
the instructions:

.. code-block:: text

   bb.0.entry:
     liveins: $edi, $esi

The list of live in registers and successors can be empty. The language also
allows multiple live in register and successor lists; the parser combines
them into one list.
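
The combining behaviour can be pictured with a small Python sketch. It collects
the registers of every ``liveins:`` line of one block in order of appearance.
The function name ``combined_liveins`` is made up for this example, and the
sketch only handles plain register names.

.. code-block:: python

   def combined_liveins(block_lines):
       """Merge all liveins: lists of one block into a single list."""
       registers = []
       for line in block_lines:
           stripped = line.strip()
           if not stripped.startswith("liveins:"):
               continue
           for name in stripped[len("liveins:"):].split(","):
               name = name.strip()
               if name:  # An empty list contributes nothing.
                   registers.append(name)
       return registers


   block = [
       "bb.0.entry:",
       "  liveins: $edi",
       "  liveins: $esi, $edx",
       "  liveins:",
   ]
   print(combined_liveins(block))  # ['$edi', '$esi', '$edx']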

Miscellaneous Attributes
^^^^^^^^^^^^^^^^^^^^^^^^

The attributes ``IsAddressTaken``, ``IsLandingPad`` and ``Alignment`` can be
specified in brackets after the block's definition:

.. code-block:: text

   bb.0.entry (address-taken):
     <instructions>
   bb.2.else (align 4):
     <instructions>
   bb.3(landing-pad, align 4):
     <instructions>

.. TODO: Describe the way the reference to an unnamed LLVM IR block can be
   preserved.

Machine Instructions
--------------------

A machine instruction is composed of a name,
:ref:`machine operands <machine-operands>`,
:ref:`instruction flags <instruction-flags>`, and machine memory operands.

The instruction's name is usually specified before the operands. The example
below shows an instance of the X86 ``RETQ`` instruction with a single machine
operand:

.. code-block:: text

   RETQ $eax

However, if the machine instruction has one or more explicitly defined register
operands, the instruction's name has to be specified after them. The example
below shows an instance of the AArch64 ``LDPXpost`` instruction with three
defined register operands:

.. code-block:: text

   $sp, $fp, $lr = LDPXpost $sp, 2

The instruction names are serialized using the exact definitions from the
target's ``*InstrInfo.td`` files, and they are case sensitive. This means that
similar instruction names like ``TSTri`` and ``tSTRi`` represent different
machine instructions.

.. _instruction-flags:

Instruction Flags
^^^^^^^^^^^^^^^^^

The flag ``frame-setup`` or ``frame-destroy`` can be specified before the
instruction's name:

.. code-block:: text

   $fp = frame-setup ADDXri $sp, 0, 0

.. code-block:: text

   $x21, $x20 = frame-destroy LDPXi $sp
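
The ordering rules above (defined register operands first, then instruction
flags, then the name) can be sketched in Python as follows. This is a
simplified tokenizer for illustration only: ``split_instruction`` is a
hypothetical helper, it knows just the two flags named in this section, and it
does not attempt to parse the operands that follow the name.

.. code-block:: python

   INSTRUCTION_FLAGS = ("frame-setup", "frame-destroy")


   def split_instruction(text):
       """Return (defined operands, instruction flags, instruction name)."""
       defs = []
       if " = " in text:
           # Explicitly defined register operands come before the name.
           lhs, text = text.split(" = ", 1)
           defs = [operand.strip() for operand in lhs.split(",")]
       tokens = text.split()
       flags = []
       while tokens and tokens[0] in INSTRUCTION_FLAGS:
           flags.append(tokens.pop(0))
       name = tokens[0] if tokens else None
       return defs, flags, name


   print(split_instruction("$fp = frame-setup ADDXri $sp, 0, 0"))
   # (['$fp'], ['frame-setup'], 'ADDXri')

The name is returned unchanged because instruction names are case sensitive.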

Bundled Instructions
^^^^^^^^^^^^^^^^^^^^

The syntax for bundled instructions is the following:

.. code-block:: text

   BUNDLE implicit-def $r0, implicit-def $r1, implicit $r2 {
     $r0 = SOME_OP $r2
     $r1 = ANOTHER_OP internal $r0
   }

The first instruction is often a bundle header. The instructions between ``{``
and ``}`` are bundled with the first instruction.

.. _registers:

Registers
---------

Registers are one of the key primitives in the machine instructions
serialization language. They are primarily used in the
:ref:`register machine operands <register-operands>`,
but they can also be used in a number of other places, like the
:ref:`basic block's live in list <bb-liveins>`.

The physical registers are identified by their name and by the '$' prefix sigil.
They use the following syntax:

.. code-block:: text

   $<name>

The example below shows three X86 physical registers:

.. code-block:: text

   $eax
   $r15
   $eflags

The virtual registers are identified by their ID number and by the '%' sigil.
They use the following syntax:

.. code-block:: text

   %<id>

Example:

.. code-block:: text

   %0

The null registers are represented using an underscore ('``_``'). They can also be
represented using a '``$noreg``' named register, although the former syntax
is preferred.
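
The three kinds of register tokens can be told apart by their sigils. The
Python sketch below classifies a token accordingly; the function name
``classify_register`` and the character set accepted for physical register
names are assumptions made for this illustration.

.. code-block:: python

   import re


   def classify_register(token):
       """Classify a register token by its sigil."""
       if token in ("_", "$noreg"):
           return "null"
       if re.fullmatch(r"\$[A-Za-z0-9_.]+", token):
           return "physical"
       if re.fullmatch(r"%\d+", token):
           return "virtual"
       return "unknown"


   for token in ("$eax", "%0", "_", "$noreg"):
       print(token, classify_register(token))

The null check comes first so that ``$noreg`` is not reported as a physical
register. A block reference such as ``%bb.0`` also starts with '%' but is not a
virtual register, which is why the sketch requires digits only after the sigil.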

.. _machine-operands:

Machine Operands
----------------

There are seventeen different kinds of machine operands, and all of them can be
serialized.

Immediate Operands
^^^^^^^^^^^^^^^^^^

The immediate machine operands are untyped, 64-bit signed integers. The
example below shows an instance of the X86 ``MOV32ri`` instruction that has an
immediate machine operand ``-42``:

.. code-block:: text

   $eax = MOV32ri -42

An immediate operand is also used to represent a subregister index when the
machine instruction has one of the following opcodes:

- ``EXTRACT_SUBREG``

- ``INSERT_SUBREG``

- ``REG_SEQUENCE``

- ``SUBREG_TO_REG``

In that case, the machine operand is printed using the target's name for the
subregister index.

For example, AArch64RegisterInfo.td contains the following definition:

.. code-block:: text

   def sub_32 : SubRegIndex<32>;

If the third operand is an immediate with the value ``15`` (target-dependent
value), based on the instruction's opcode and the operand's index the operand
will be printed as ``%subreg.sub_32``:

.. code-block:: text

   %1:gpr64 = SUBREG_TO_REG 0, %0, %subreg.sub_32

For integers wider than 64 bits, we use a special machine operand,
``MO_CImmediate``, which stores the immediate in a ``ConstantInt`` using an
``APInt`` (LLVM's arbitrary precision integers).

.. TODO: Describe the FPIMM immediate operands.

.. _register-operands:

Register Operands
^^^^^^^^^^^^^^^^^

The :ref:`register <registers>` primitive is used to represent the register
machine operands. The register operands can also have optional
:ref:`register flags <register-flags>`,
:ref:`a subregister index <subregister-indices>`,
and a reference to the tied register operand.
The full syntax of a register operand is shown below:

.. code-block:: text

   [<flags>] <register> [ :<subregister-idx-name> ] [ (tied-def <tied-op>) ]

This example shows an instance of the X86 ``XOR32rr`` instruction that has
5 register operands with different register flags:

.. code-block:: text

   dead $eax = XOR32rr undef $eax, undef $eax, implicit-def dead $eflags, implicit-def $al

.. _register-flags:

Register Flags
~~~~~~~~~~~~~~

The table below shows all of the possible register flags along with the
corresponding internal ``llvm::RegState`` representation:

.. list-table::
   :header-rows: 1

   * - Flag
     - Internal Value

   * - ``implicit``
     - ``RegState::Implicit``

   * - ``implicit-def``
     - ``RegState::ImplicitDefine``

   * - ``def``
     - ``RegState::Define``

   * - ``dead``
     - ``RegState::Dead``

   * - ``killed``
     - ``RegState::Kill``

   * - ``undef``
     - ``RegState::Undef``

   * - ``internal``
     - ``RegState::InternalRead``

   * - ``early-clobber``
     - ``RegState::EarlyClobber``

   * - ``debug-use``
     - ``RegState::Debug``

   * - ``renamable``
     - ``RegState::Renamable``
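
The table translates directly into a lookup structure. The Python sketch below
uses it to separate the leading flags of a register operand from the register
itself. The dictionary restates the table above; the function name
``split_register_flags`` is hypothetical, and the sketch does not interpret
subregister indices or ``tied-def`` references.

.. code-block:: python

   # Flag spelling in MIR mapped to the internal llvm::RegState value.
   REGISTER_FLAGS = {
       "implicit": "RegState::Implicit",
       "implicit-def": "RegState::ImplicitDefine",
       "def": "RegState::Define",
       "dead": "RegState::Dead",
       "killed": "RegState::Kill",
       "undef": "RegState::Undef",
       "internal": "RegState::InternalRead",
       "early-clobber": "RegState::EarlyClobber",
       "debug-use": "RegState::Debug",
       "renamable": "RegState::Renamable",
   }


   def split_register_flags(operand):
       """Return (flags, rest) for the text of one register operand."""
       tokens = operand.split()
       flags = []
       while tokens and tokens[0] in REGISTER_FLAGS:
           flags.append(tokens.pop(0))
       return flags, " ".join(tokens)


   flags, register = split_register_flags("implicit-def dead $eflags")
   print(flags, register)  # ['implicit-def', 'dead'] $eflags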

.. _subregister-indices:

Subregister Indices
~~~~~~~~~~~~~~~~~~~

The register machine operands can reference a portion of a register by using
the subregister indices. The example below shows an instance of the ``COPY``
pseudo instruction that uses the X86 ``sub_8bit`` subregister index to copy the
8 lower bits from the 32-bit virtual register 0 to the 8-bit virtual register 1:

.. code-block:: text

   %1 = COPY %0:sub_8bit

The names of the subregister indices are target specific, and are typically
defined in the target's ``*RegisterInfo.td`` file.

Constant Pool Indices
^^^^^^^^^^^^^^^^^^^^^

A constant pool index (CPI) operand is printed using its index in the
function's ``MachineConstantPool`` and an offset.

For example, a CPI with the index 1 and offset 8:

.. code-block:: text

   %1:gr64 = MOV64ri %const.1 + 8

For a CPI with the index 0 and offset -12:

.. code-block:: text

   %1:gr64 = MOV64ri %const.0 - 12
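
The printed form of the two examples can be reproduced with the short Python
sketch below. The function name ``format_constant_pool_operand`` is made up for
this illustration, and omitting a zero offset is an assumption that this
section does not state.

.. code-block:: python

   def format_constant_pool_operand(index, offset=0):
       """Format a CPI operand as "%const.<index>" plus a signed offset."""
       text = "%const." + str(index)
       if offset > 0:
           text += " + " + str(offset)
       elif offset < 0:
           text += " - " + str(-offset)
       # Assumption: an offset of zero is not printed at all.
       return text


   print(format_constant_pool_operand(1, 8))    # %const.1 + 8
   print(format_constant_pool_operand(0, -12))  # %const.0 - 12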

A constant pool entry is bound to an LLVM IR ``Constant`` or a target-specific
``MachineConstantPoolValue``. When serializing all the function's constants the
following format is used:

.. code-block:: text

   constants:
     - id: <index>
       value: <value>
       alignment: <alignment>
       isTargetSpecific: <target-specific>

where ``<index>`` is a 32-bit unsigned integer, ``<value>`` is an `LLVM IR Constant
<https://www.llvm.org/docs/LangRef.html#constants>`_, ``<alignment>`` is a 32-bit
unsigned integer, and ``<target-specific>`` is either true or false.

Example:

.. code-block:: text

   constants:
     - id: 0
       value: 'double 3.250000e+00'
       alignment: 8
     - id: 1
       value: 'g-(LPC0+8)'
       alignment: 4
       isTargetSpecific: true

Global Value Operands
^^^^^^^^^^^^^^^^^^^^^

The global value machine operands reference the global values from the
:ref:`embedded LLVM IR module <embedded-module>`.
The example below shows an instance of the X86 ``MOV64rm`` instruction that has
a global value operand named ``G``:

.. code-block:: text

   $rax = MOV64rm $rip, 1, _, @G, _

The named global values are represented using an identifier with the '@' prefix.
If the identifier doesn't match the regular expression
``[-a-zA-Z$._][-a-zA-Z$._0-9]*``, then this identifier must be quoted.

The unnamed global values are represented using an unsigned numeric value with
the '@' prefix, like in the following examples: ``@0``, ``@989``.
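
The quoting rule for named global values can be checked with the regular
expression given above. The Python sketch below applies it; the function name
``global_value_operand`` is hypothetical, and wrapping the name in double
quotes without any escaping is a simplifying assumption, so names that contain
quotes, backslashes or non-printable characters are not handled.

.. code-block:: python

   import re

   # The pattern from this section for names that need no quotes.
   _UNQUOTED_NAME = re.compile(r"[-a-zA-Z$._][-a-zA-Z$._0-9]*")


   def global_value_operand(name):
       """Format a named global value operand, quoting the name if needed."""
       if _UNQUOTED_NAME.fullmatch(name):
           return "@" + name
       # Assumption: plain double quotes, no escaping of special characters.
       return '@"' + name + '"'


   print(global_value_operand("G"))       # @G
   print(global_value_operand("my var"))  # @"my var"

Unnamed global values such as ``@0`` are numeric and are not covered by this
helper.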

Target-dependent Index Operands
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

A target index operand is a target-specific index and an offset. The
target-specific index is printed using target-specific names and a positive or
negative offset.

For example, the ``amdgpu-constdata-start`` is associated with the index ``0``
in the AMDGPU backend. A target index operand with the index 0 and the offset 8
is therefore printed as:

.. code-block:: text

   $sgpr2 = S_ADD_U32 _, target-index(amdgpu-constdata-start) + 8, implicit-def _, implicit-def _

Jump-table Index Operands
^^^^^^^^^^^^^^^^^^^^^^^^^

A jump-table index operand with the index 0 is printed as follows:

.. code-block:: text

   tBR_JTr killed $r0, %jump-table.0

A machine jump-table entry contains a list of ``MachineBasicBlocks``. When
serializing all the function's jump-table entries, the following format is
used:

.. code-block:: text

   jumpTable:
     kind: <kind>
     entries:
       - id: <index>
         blocks: [ <bbreference>, <bbreference>, ... ]

where ``<kind>`` describes how the jump table is represented and emitted (plain
address, relocations, PIC, etc.), each ``<index>`` is a 32-bit unsigned
integer, and ``blocks`` contains a list of
:ref:`machine basic block references <block-references>`.

Example:

.. code-block:: text

   jumpTable:
     kind: inline
     entries:
       - id: 0
         blocks: [ '%bb.3', '%bb.9', '%bb.4.d3' ]
       - id: 1
         blocks: [ '%bb.7', '%bb.7', '%bb.4.d3', '%bb.5' ]

External Symbol Operands
^^^^^^^^^^^^^^^^^^^^^^^^^

An external symbol operand is represented using an identifier with the ``&``
prefix. The identifier is surrounded with double quotes and escaped if it
contains any special non-printable characters.

Example:

.. code-block:: text

   CALL64pcrel32 &__stack_chk_fail, csr_64, implicit $rsp, implicit-def $rsp

MCSymbol Operands
^^^^^^^^^^^^^^^^^

An MCSymbol operand holds a pointer to an ``MCSymbol``. For the limitations
of this operand in MIR, see :ref:`limitations <limitations>`.

The syntax is:

.. code-block:: text

   EH_LABEL <mcsymbol Ltmp1>

CFIIndex Operands
^^^^^^^^^^^^^^^^^

A CFI Index operand holds an index into a per-function side-table,
``MachineFunction::getFrameInstructions()``, which references all the frame
instructions in a ``MachineFunction``. A ``CFI_INSTRUCTION`` may look like it
contains multiple operands, but the only operand it contains is the CFI Index.
The other operands are tracked by the ``MCCFIInstruction`` object.

The syntax is:

.. code-block:: text

   CFI_INSTRUCTION offset $w30, -16

which may be emitted later in the MC layer as:

.. code-block:: text

   .cfi_offset w30, -16

IntrinsicID Operands
^^^^^^^^^^^^^^^^^^^^

An Intrinsic ID operand contains a generic intrinsic ID or a target-specific ID.

The syntax for the ``returnaddress`` intrinsic is:

.. code-block:: text

   $x0 = COPY intrinsic(@llvm.returnaddress)

Predicate Operands
^^^^^^^^^^^^^^^^^^

A Predicate operand contains an IR predicate from ``CmpInst::Predicate``, like
``ICMP_EQ``, etc.

For an int eq predicate ``ICMP_EQ``, the syntax is:

.. code-block:: text

   %2:gpr(s32) = G_ICMP intpred(eq), %0, %1

.. TODO: Describe the parsers default behaviour when optional YAML attributes
   are missing.
.. TODO: Describe the syntax for virtual register YAML definitions.
.. TODO: Describe the machine function's YAML flag attributes.
.. TODO: Describe the syntax for the register mask machine operands.
.. TODO: Describe the frame information YAML mapping.
.. TODO: Describe the syntax of the stack object machine operands and their
   YAML definitions.
.. TODO: Describe the syntax of the block address machine operands.
.. TODO: Describe the syntax of the metadata machine operands, and the
   instructions debug location attribute.
.. TODO: Describe the syntax of the register live out machine operands.
.. TODO: Describe the syntax of the machine memory operands.