blob: cfdcc2b889232b80478bd3366c70ccaf5426e849 [file] [log] [blame]
Chris Lattnerce52b7e2004-06-01 06:48:00 +00001<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
2 "http://www.w3.org/TR/html4/strict.dtd">
3<html>
4<head>
5 <title>The LLVM Target-Independent Code Generator</title>
6 <link rel="stylesheet" href="llvm.css" type="text/css">
7</head>
8<body>
9
10<div class="doc_title">
11 The LLVM Target-Independent Code Generator
12</div>
13
14<ol>
15 <li><a href="#introduction">Introduction</a>
16 <ul>
17 <li><a href="#required">Required components in the code generator</a></li>
18 <li><a href="#high-level-design">The high-level design of the code generator</a></li>
19 <li><a href="#tablegen">Using TableGen for target description</a></li>
20 </ul>
21 </li>
22 <li><a href="#targetdesc">Target description classes</a>
23 <ul>
24 <li><a href="#targetmachine">The <tt>TargetMachine</tt> class</a></li>
25 <li><a href="#targetdata">The <tt>TargetData</tt> class</a></li>
Chris Lattneraa5bcb52005-01-28 17:22:53 +000026 <li><a href="#targetlowering">The <tt>TargetLowering</tt> class</a></li>
Chris Lattnerce52b7e2004-06-01 06:48:00 +000027 <li><a href="#mregisterinfo">The <tt>MRegisterInfo</tt> class</a></li>
28 <li><a href="#targetinstrinfo">The <tt>TargetInstrInfo</tt> class</a></li>
29 <li><a href="#targetframeinfo">The <tt>TargetFrameInfo</tt> class</a></li>
30 <li><a href="#targetjitinfo">The <tt>TargetJITInfo</tt> class</a></li>
31 </ul>
32 </li>
33 <li><a href="#codegendesc">Machine code description classes</a>
Chris Lattnerec94f802004-06-04 00:16:02 +000034 <ul>
Chris Lattneraa5bcb52005-01-28 17:22:53 +000035 <li><a href="#machineinstr">The <tt>MachineInstr</tt> class</a></li>
Chris Lattnerec94f802004-06-04 00:16:02 +000036 </ul>
Chris Lattnerce52b7e2004-06-01 06:48:00 +000037 </li>
38 <li><a href="#codegenalgs">Target-independent code generation algorithms</a>
Chris Lattneraa5bcb52005-01-28 17:22:53 +000039 <ul>
40 <li><a href="#instselect">Instruction Selection</a>
41 <ul>
42 <li><a href="#selectiondag_intro">Introduction to SelectionDAGs</a></li>
43 <li><a href="#selectiondag_process">SelectionDAG Code Generation
44 Process</a></li>
45 <li><a href="#selectiondag_build">Initial SelectionDAG
46 Construction</a></li>
47 <li><a href="#selectiondag_legalize">SelectionDAG Legalize Phase</a></li>
48 <li><a href="#selectiondag_optimize">SelectionDAG Optimization
49 Phase</a></li>
50 <li><a href="#selectiondag_select">SelectionDAG Select Phase</a></li>
51 <li><a href="#selectiondag_future">Future directions for the
52 SelectionDAG</a></li>
53 </ul></li>
54 </ul>
Chris Lattnerce52b7e2004-06-01 06:48:00 +000055 </li>
56 <li><a href="#targetimpls">Target description implementations</a>
57 <ul>
Chris Lattneraa5bcb52005-01-28 17:22:53 +000058 <li><a href="#x86">The X86 backend</a></li>
Chris Lattner10d68002004-06-01 17:18:11 +000059 </ul>
Chris Lattnerce52b7e2004-06-01 06:48:00 +000060 </li>
61
62</ol>
63
64<div class="doc_author">
65 <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a></p>
66</div>
67
Chris Lattner10d68002004-06-01 17:18:11 +000068<div class="doc_warning">
69 <p>Warning: This is a work in progress.</p>
70</div>
71
Chris Lattnerce52b7e2004-06-01 06:48:00 +000072<!-- *********************************************************************** -->
73<div class="doc_section">
74 <a name="introduction">Introduction</a>
75</div>
76<!-- *********************************************************************** -->
77
78<div class="doc_text">
79
80<p>The LLVM target-independent code generator is a framework that provides a
81suite of reusable components for translating the LLVM internal representation to
82the machine code for a specified target -- either in assembly form (suitable for
83a static compiler) or in binary machine code format (usable for a JIT compiler).
Chris Lattnerec94f802004-06-04 00:16:02 +000084The LLVM target-independent code generator consists of five main components:</p>
Chris Lattnerce52b7e2004-06-01 06:48:00 +000085
86<ol>
87<li><a href="#targetdesc">Abstract target description</a> interfaces which
Reid Spencerbdbcb8a2004-06-05 14:39:24 +000088capture important properties about various aspects of the machine, independently
Chris Lattnerce52b7e2004-06-01 06:48:00 +000089of how they will be used. These interfaces are defined in
90<tt>include/llvm/Target/</tt>.</li>
91
92<li>Classes used to represent the <a href="#codegendesc">machine code</a> being
Reid Spencerbdbcb8a2004-06-05 14:39:24 +000093generated for a target. These classes are intended to be abstract enough to
Chris Lattnerce52b7e2004-06-01 06:48:00 +000094represent the machine code for <i>any</i> target machine. These classes are
95defined in <tt>include/llvm/CodeGen/</tt>.</li>
96
97<li><a href="#codegenalgs">Target-independent algorithms</a> used to implement
98various phases of native code generation (register allocation, scheduling, stack
99frame representation, etc). This code lives in <tt>lib/CodeGen/</tt>.</li>
100
101<li><a href="#targetimpls">Implementations of the abstract target description
102interfaces</a> for particular targets. These machine descriptions make use of
103the components provided by LLVM, and can optionally provide custom
104target-specific passes, to build complete code generators for a specific target.
105Target descriptions live in <tt>lib/Target/</tt>.</li>
106
Chris Lattnerec94f802004-06-04 00:16:02 +0000107<li><a href="#jit">The target-independent JIT components</a>. The LLVM JIT is
108completely target independent (it uses the <tt>TargetJITInfo</tt> structure to
109interface for target-specific issues. The code for the target-independent
110JIT lives in <tt>lib/ExecutionEngine/JIT</tt>.</li>
111
Chris Lattnerce52b7e2004-06-01 06:48:00 +0000112</ol>
113
114<p>
115Depending on which part of the code generator you are interested in working on,
116different pieces of this will be useful to you. In any case, you should be
117familiar with the <a href="#targetdesc">target description</a> and <a
118href="#codegendesc">machine code representation</a> classes. If you want to add
Reid Spencerbdbcb8a2004-06-05 14:39:24 +0000119a backend for a new target, you will need to <a href="#targetimpls">implement the
120target description</a> classes for your new target and understand the <a
Chris Lattnerce52b7e2004-06-01 06:48:00 +0000121href="LangRef.html">LLVM code representation</a>. If you are interested in
122implementing a new <a href="#codegenalgs">code generation algorithm</a>, it
123should only depend on the target-description and machine code representation
124classes, ensuring that it is portable.
125</p>
126
127</div>
128
129<!-- ======================================================================= -->
130<div class="doc_subsection">
131 <a name="required">Required components in the code generator</a>
132</div>
133
134<div class="doc_text">
135
136<p>The two pieces of the LLVM code generator are the high-level interface to the
137code generator and the set of reusable components that can be used to build
138target-specific backends. The two most important interfaces (<a
139href="#targetmachine"><tt>TargetMachine</tt></a> and <a
140href="#targetdata"><tt>TargetData</tt></a> classes) are the only ones that are
141required to be defined for a backend to fit into the LLVM system, but the others
142must be defined if the reusable code generator components are going to be
143used.</p>
144
145<p>This design has two important implications. The first is that LLVM can
146support completely non-traditional code generation targets. For example, the C
147backend does not require register allocation, instruction selection, or any of
148the other standard components provided by the system. As such, it only
149implements these two interfaces, and does its own thing. Another example of a
150code generator like this is a (purely hypothetical) backend that converts LLVM
151to the GCC RTL form and uses GCC to emit machine code for a target.</p>
152
Reid Spencerbdbcb8a2004-06-05 14:39:24 +0000153<p>This design also implies that it is possible to design and
Chris Lattnerce52b7e2004-06-01 06:48:00 +0000154implement radically different code generators in the LLVM system that do not
155make use of any of the built-in components. Doing so is not recommended at all,
156but could be required for radically different targets that do not fit into the
157LLVM machine description model: programmable FPGAs for example.</p>
Chris Lattner900bf8c2004-06-02 07:06:06 +0000158
159<p><b>Important Note:</b> For historical reasons, the LLVM SparcV9 code
160generator uses almost entirely different code paths than described in this
161document. For this reason, there are some deprecated interfaces (such as
162<tt>TargetRegInfo</tt> and <tt>TargetSchedInfo</tt>), which are only used by the
163V9 backend and should not be used by any other targets. Also, all code in the
164<tt>lib/Target/SparcV9</tt> directory and subdirectories should be considered
165deprecated, and should not be used as the basis for future code generator work.
Misha Brukmanf3709d62004-06-03 16:55:57 +0000166The SparcV9 backend is slowly being merged into the rest of the
167target-independent code generators, but this is a low-priority process with no
Chris Lattner900bf8c2004-06-02 07:06:06 +0000168predictable completion date.</p>
169
Chris Lattnerce52b7e2004-06-01 06:48:00 +0000170</div>
171
172<!-- ======================================================================= -->
173<div class="doc_subsection">
Chris Lattner10d68002004-06-01 17:18:11 +0000174 <a name="high-level-design">The high-level design of the code generator</a>
Chris Lattnerce52b7e2004-06-01 06:48:00 +0000175</div>
176
177<div class="doc_text">
178
Chris Lattneraa5bcb52005-01-28 17:22:53 +0000179<p>The LLVM target-independent code generator is designed to support efficient and
Chris Lattnerce52b7e2004-06-01 06:48:00 +0000180quality code generation for standard register-based microprocessors. Code
181generation in this model is divided into the following stages:</p>
182
183<ol>
Chris Lattneraa5bcb52005-01-28 17:22:53 +0000184<li><b><a href="#instselect">Instruction Selection</a></b> - Determining an
185efficient implementation of the input LLVM code in the target instruction set.
186This stage produces the initial code for the program in the target instruction
187set, then makes use of virtual registers in SSA form and physical registers that
188represent any required register assignments due to target constraints or calling
189conventions.</li>
Chris Lattnerce52b7e2004-06-01 06:48:00 +0000190
191<li><b>SSA-based Machine Code Optimizations</b> - This (optional) stage consists
192of a series of machine-code optimizations that operate on the SSA-form produced
193by the instruction selector. Optimizations like modulo-scheduling, normal
194scheduling, or peephole optimization work here.</li>
195
196<li><b>Register Allocation</b> - The target code is transformed from an infinite
197virtual register file in SSA form to the concrete register file used by the
198target. This phase introduces spill code and eliminates all virtual register
199references from the program.</li>
200
201<li><b>Prolog/Epilog Code Insertion</b> - Once the machine code has been
202generated for the function and the amount of stack space required is known (used
203for LLVM alloca's and spill slots), the prolog and epilog code for the function
204can be inserted and "abstract stack location references" can be eliminated.
205This stage is responsible for implementing optimizations like frame-pointer
206elimination and stack packing.</li>
207
208<li><b>Late Machine Code Optimizations</b> - Optimizations that operate on
209"final" machine code can go here, such as spill code scheduling and peephole
210optimizations.</li>
211
Reid Spencerbdbcb8a2004-06-05 14:39:24 +0000212<li><b>Code Emission</b> - The final stage actually outputs the code for
Chris Lattnerce52b7e2004-06-01 06:48:00 +0000213the current function, either in the target assembler format or in machine
214code.</li>
215
216</ol>
217
218<p>
219The code generator is based on the assumption that the instruction selector will
220use an optimal pattern matching selector to create high-quality sequences of
Reid Spencerbdbcb8a2004-06-05 14:39:24 +0000221native instructions. Alternative code generator designs based on pattern
222expansion and
223aggressive iterative peephole optimization are much slower. This design
224permits efficient compilation (important for JIT environments) and
225aggressive optimization (used when generating code offline) by allowing
Chris Lattneraa5bcb52005-01-28 17:22:53 +0000226components of varying levels of sophistication to be used for any step of
Reid Spencerbdbcb8a2004-06-05 14:39:24 +0000227compilation.</p>
Chris Lattnerce52b7e2004-06-01 06:48:00 +0000228
229<p>
230In addition to these stages, target implementations can insert arbitrary
231target-specific passes into the flow. For example, the X86 target uses a
232special pass to handle the 80x87 floating point stack architecture. Other
233targets with unusual requirements can be supported with custom passes as needed.
234</p>
235
236</div>
237
238
239<!-- ======================================================================= -->
240<div class="doc_subsection">
Chris Lattner10d68002004-06-01 17:18:11 +0000241 <a name="tablegen">Using TableGen for target description</a>
Chris Lattnerce52b7e2004-06-01 06:48:00 +0000242</div>
243
244<div class="doc_text">
245
Chris Lattner5489e932004-06-01 18:35:00 +0000246<p>The target description classes require a detailed description of the target
Chris Lattnerce52b7e2004-06-01 06:48:00 +0000247architecture. These target descriptions often have a large amount of common
248information (e.g., an add instruction is almost identical to a sub instruction).
249In order to allow the maximum amount of commonality to be factored out, the LLVM
250code generator uses the <a href="TableGenFundamentals.html">TableGen</a> tool to
Chris Lattner5489e932004-06-01 18:35:00 +0000251describe big chunks of the target machine, which allows the use of domain- and
252target-specific abstractions to reduce the amount of repetition.
Chris Lattnerce52b7e2004-06-01 06:48:00 +0000253</p>
254
255</div>
256
257<!-- *********************************************************************** -->
258<div class="doc_section">
259 <a name="targetdesc">Target description classes</a>
260</div>
261<!-- *********************************************************************** -->
262
263<div class="doc_text">
264
265<p>The LLVM target description classes (which are located in the
266<tt>include/llvm/Target</tt> directory) provide an abstract description of the
267target machine, independent of any particular client. These classes are
268designed to capture the <i>abstract</i> properties of the target (such as what
269instruction and registers it has), and do not incorporate any particular pieces
270of code generation algorithms (these interfaces do not take interference graphs
271as inputs or other algorithm-specific data structures).</p>
272
273<p>All of the target description classes (except the <tt><a
274href="#targetdata">TargetData</a></tt> class) are designed to be subclassed by
275the concrete target implementation, and have virtual methods implemented. To
Reid Spencerbdbcb8a2004-06-05 14:39:24 +0000276get to these implementations, the <tt><a
Chris Lattnerce52b7e2004-06-01 06:48:00 +0000277href="#targetmachine">TargetMachine</a></tt> class provides accessors that
278should be implemented by the target.</p>
279
280</div>
281
282<!-- ======================================================================= -->
283<div class="doc_subsection">
284 <a name="targetmachine">The <tt>TargetMachine</tt> class</a>
285</div>
286
287<div class="doc_text">
288
289<p>The <tt>TargetMachine</tt> class provides virtual methods that are used to
290access the target-specific implementations of the various target description
291classes (with the <tt>getInstrInfo</tt>, <tt>getRegisterInfo</tt>,
Reid Spencerbdbcb8a2004-06-05 14:39:24 +0000292<tt>getFrameInfo</tt>, ... methods). This class is designed to be specialized by
Chris Lattnerce52b7e2004-06-01 06:48:00 +0000293a concrete target implementation (e.g., <tt>X86TargetMachine</tt>) which
294implements the various virtual methods. The only required target description
295class is the <a href="#targetdata"><tt>TargetData</tt></a> class, but if the
296code generator components are to be used, the other interfaces should be
297implemented as well.</p>
298
299</div>
300
301
302<!-- ======================================================================= -->
303<div class="doc_subsection">
304 <a name="targetdata">The <tt>TargetData</tt> class</a>
305</div>
306
307<div class="doc_text">
308
309<p>The <tt>TargetData</tt> class is the only required target description class,
310and it is the only class that is not extensible (it cannot be derived from). It
311specifies information about how the target lays out memory for structures, the
312alignment requirements for various data types, the size of pointers in the
313target, and whether the target is little- or big-endian.</p>
314
315</div>
316
Chris Lattneraa5bcb52005-01-28 17:22:53 +0000317<!-- ======================================================================= -->
318<div class="doc_subsection">
319 <a name="targetlowering">The <tt>TargetLowering</tt> class</a>
320</div>
321
322<div class="doc_text">
323
324<p>The <tt>TargetLowering</tt> class is used by SelectionDAG based instruction
325selectors primarily to describe how LLVM code should be lowered to SelectionDAG
326operations. Among other things, this class indicates an initial register class
327to use for various ValueTypes, which operations are natively supported by the
328target machine, and some other miscellaneous properties (such as the return type
329of setcc operations, the type to use for shift amounts, etc).</p>
330
331</div>
332
333
334
335
Chris Lattnerce52b7e2004-06-01 06:48:00 +0000336
337<!-- ======================================================================= -->
338<div class="doc_subsection">
Chris Lattner10d68002004-06-01 17:18:11 +0000339 <a name="mregisterinfo">The <tt>MRegisterInfo</tt> class</a>
Chris Lattnerce52b7e2004-06-01 06:48:00 +0000340</div>
341
342<div class="doc_text">
343
344<p>The <tt>MRegisterInfo</tt> class (which will eventually be renamed to
345<tt>TargetRegisterInfo</tt>) is used to describe the register file of the
346target and any interactions between the registers.</p>
347
348<p>Registers in the code generator are represented in the code generator by
349unsigned numbers. Physical registers (those that actually exist in the target
350description) are unique small numbers, and virtual registers are generally
351large.</p>
352
353<p>Each register in the processor description has an associated
354<tt>MRegisterDesc</tt> entry, which provides a textual name for the register
355(used for assembly output and debugging dumps), a set of aliases (used to
356indicate that one register overlaps with another), and some flag bits.
357</p>
358
359<p>In addition to the per-register description, the <tt>MRegisterInfo</tt> class
360exposes a set of processor specific register classes (instances of the
361<tt>TargetRegisterClass</tt> class). Each register class contains sets of
362registers that have the same properties (for example, they are all 32-bit
363integer registers). Each SSA virtual register created by the instruction
364selector has an associated register class. When the register allocator runs, it
365replaces virtual registers with a physical register in the set.</p>
366
367<p>
368The target-specific implementations of these classes is auto-generated from a <a
369href="TableGenFundamentals.html">TableGen</a> description of the register file.
370</p>
371
372</div>
373
374<!-- ======================================================================= -->
375<div class="doc_subsection">
Chris Lattner10d68002004-06-01 17:18:11 +0000376 <a name="targetinstrinfo">The <tt>TargetInstrInfo</tt> class</a>
Chris Lattnerce52b7e2004-06-01 06:48:00 +0000377</div>
378
379<!-- ======================================================================= -->
380<div class="doc_subsection">
Chris Lattner10d68002004-06-01 17:18:11 +0000381 <a name="targetframeinfo">The <tt>TargetFrameInfo</tt> class</a>
Chris Lattnerce52b7e2004-06-01 06:48:00 +0000382</div>
383
384<!-- ======================================================================= -->
385<div class="doc_subsection">
Chris Lattner10d68002004-06-01 17:18:11 +0000386 <a name="targetjitinfo">The <tt>TargetJITInfo</tt> class</a>
Chris Lattnerce52b7e2004-06-01 06:48:00 +0000387</div>
388
389<!-- *********************************************************************** -->
390<div class="doc_section">
391 <a name="codegendesc">Machine code description classes</a>
392</div>
393<!-- *********************************************************************** -->
394
Chris Lattnerec94f802004-06-04 00:16:02 +0000395<div class="doc_text">
Chris Lattnerce52b7e2004-06-01 06:48:00 +0000396
Chris Lattnerec94f802004-06-04 00:16:02 +0000397<p>
398At the high-level, LLVM code is translated to a machine specific representation
399formed out of MachineFunction, MachineBasicBlock, and <a
400href="#machineinstr"><tt>MachineInstr</tt></a> instances
401(defined in include/llvm/CodeGen). This representation is completely target
402agnostic, representing instructions in their most abstract form: an opcode and a
403series of operands. This representation is designed to support both SSA
404representation for machine code, as well as a register allocated, non-SSA form.
405</p>
406
407</div>
408
409<!-- ======================================================================= -->
410<div class="doc_subsection">
411 <a name="machineinstr">The <tt>MachineInstr</tt> class</a>
412</div>
413
414<div class="doc_text">
415
416<p>Target machine instructions are represented as instances of the
417<tt>MachineInstr</tt> class. This class is an extremely abstract way of
418representing machine instructions. In particular, all it keeps track of is
419an opcode number and some number of operands.</p>
420
421<p>The opcode number is an simple unsigned number that only has meaning to a
422specific backend. All of the instructions for a target should be defined in
423the <tt>*InstrInfo.td</tt> file for the target, and the opcode enum values
Chris Lattneraa5bcb52005-01-28 17:22:53 +0000424are auto-generated from this description. The <tt>MachineInstr</tt> class does
425not have any information about how to interpret the instruction (i.e., what the
Chris Lattnerec94f802004-06-04 00:16:02 +0000426semantics of the instruction are): for that you must refer to the
427<tt><a href="#targetinstrinfo">TargetInstrInfo</a></tt> class.</p>
428
429<p>The operands of a machine instruction can be of several different types:
430they can be a register reference, constant integer, basic block reference, etc.
431In addition, a machine operand should be marked as a def or a use of the value
432(though only registers are allowed to be defs).</p>
433
434<p>By convention, the LLVM code generator orders instruction operands so that
435all register definitions come before the register uses, even on architectures
Chris Lattneraa5bcb52005-01-28 17:22:53 +0000436that are normally printed in other orders. For example, the SPARC add
Chris Lattnerec94f802004-06-04 00:16:02 +0000437instruction: "<tt>add %i1, %i2, %i3</tt>" adds the "%i1", and "%i2" registers
438and stores the result into the "%i3" register. In the LLVM code generator,
439the operands should be stored as "<tt>%i3, %i1, %i2</tt>": with the destination
440first.</p>
441
442<p>Keeping destination operands at the beginning of the operand list has several
443advantages. In particular, the debugging printer will print the instruction
444like this:</p>
445
446<pre>
447 %r3 = add %i1, %i2
448</pre>
449
450<p>If the first operand is a def, and it is also easier to <a
451href="#buildmi">create instructions</a> whose only def is the first
452operand.</p>
453
454</div>
455
456<!-- _______________________________________________________________________ -->
457<div class="doc_subsubsection">
458 <a name="buildmi">Using the <tt>MachineInstrBuilder.h</tt> functions</a>
459</div>
460
461<div class="doc_text">
462
463<p>Machine instructions are created by using the <tt>BuildMI</tt> functions,
464located in the <tt>include/llvm/CodeGen/MachineInstrBuilder.h</tt> file. The
465<tt>BuildMI</tt> functions make it easy to build arbitrary machine
466instructions. Usage of the <tt>BuildMI</tt> functions look like this:
467</p>
468
469<pre>
470 // Create a 'DestReg = mov 42' (rendered in X86 assembly as 'mov DestReg, 42')
471 // instruction. The '1' specifies how many operands will be added.
472 MachineInstr *MI = BuildMI(X86::MOV32ri, 1, DestReg).addImm(42);
473
474 // Create the same instr, but insert it at the end of a basic block.
475 MachineBasicBlock &amp;MBB = ...
476 BuildMI(MBB, X86::MOV32ri, 1, DestReg).addImm(42);
477
478 // Create the same instr, but insert it before a specified iterator point.
479 MachineBasicBlock::iterator MBBI = ...
480 BuildMI(MBB, MBBI, X86::MOV32ri, 1, DestReg).addImm(42);
481
482 // Create a 'cmp Reg, 0' instruction, no destination reg.
483 MI = BuildMI(X86::CMP32ri, 2).addReg(Reg).addImm(0);
484 // Create an 'sahf' instruction which takes no operands and stores nothing.
485 MI = BuildMI(X86::SAHF, 0);
486
487 // Create a self looping branch instruction.
488 BuildMI(MBB, X86::JNE, 1).addMBB(&amp;MBB);
489</pre>
490
491<p>
492The key thing to remember with the <tt>BuildMI</tt> functions is that you have
493to specify the number of operands that the machine instruction will take
494(allowing efficient memory allocation). Also, if operands default to be uses
495of values, not definitions. If you need to add a definition operand (other
496than the optional destination register), you must explicitly mark it as such.
497</p>
498
499</div>
500
501<!-- _______________________________________________________________________ -->
502<div class="doc_subsubsection">
503 <a name="fixedregs">Fixed (aka preassigned) registers</a>
504</div>
505
506<div class="doc_text">
507
508<p>One important issue that the code generator needs to be aware of is the
509presence of fixed registers. In particular, there are often places in the
510instruction stream where the register allocator <em>must</em> arrange for a
511particular value to be in a particular register. This can occur due to
512limitations in the instruction set (e.g., the X86 can only do a 32-bit divide
513with the <tt>EAX</tt>/<tt>EDX</tt> registers), or external factors like calling
514conventions. In any case, the instruction selector should emit code that
515copies a virtual register into or out of a physical register when needed.</p>
516
517<p>For example, consider this simple LLVM example:</p>
518
519<pre>
520 int %test(int %X, int %Y) {
521 %Z = div int %X, %Y
522 ret int %Z
523 }
524</pre>
525
526<p>The X86 instruction selector produces this machine code for the div
527and ret (use
528"<tt>llc X.bc -march=x86 -print-machineinstrs</tt>" to get this):</p>
529
530<pre>
531 ;; Start of div
532 %EAX = mov %reg1024 ;; Copy X (in reg1024) into EAX
533 %reg1027 = sar %reg1024, 31
534 %EDX = mov %reg1027 ;; Sign extend X into EDX
535 idiv %reg1025 ;; Divide by Y (in reg1025)
536 %reg1026 = mov %EAX ;; Read the result (Z) out of EAX
537
538 ;; Start of ret
539 %EAX = mov %reg1026 ;; 32-bit return value goes in EAX
540 ret
541</pre>
542
Chris Lattneraa5bcb52005-01-28 17:22:53 +0000543<p>By the end of code generation, the register allocator has coalesced
Chris Lattnerec94f802004-06-04 00:16:02 +0000544the registers and deleted the resultant identity moves, producing the
545following code:</p>
546
547<pre>
548 ;; X is in EAX, Y is in ECX
549 mov %EAX, %EDX
550 sar %EDX, 31
551 idiv %ECX
552 ret
553</pre>
554
555<p>This approach is extremely general (if it can handle the X86 architecture,
556it can handle anything!) and allows all of the target specific
557knowledge about the instruction stream to be isolated in the instruction
558selector. Note that physical registers should have a short lifetime for good
559code generation, and all physical registers are assumed dead on entry and
560exit of basic blocks (before register allocation). Thus if you need a value
561to be live across basic block boundaries, it <em>must</em> live in a virtual
562register.</p>
563
564</div>
565
566<!-- _______________________________________________________________________ -->
567<div class="doc_subsubsection">
568 <a name="ssa">Machine code SSA form</a>
569</div>
570
571<div class="doc_text">
572
573<p><tt>MachineInstr</tt>'s are initially instruction selected in SSA-form, and
574are maintained in SSA-form until register allocation happens. For the most
575part, this is trivially simple since LLVM is already in SSA form: LLVM PHI nodes
576become machine code PHI nodes, and virtual registers are only allowed to have a
577single definition.</p>
578
579<p>After register allocation, machine code is no longer in SSA-form, as there
580are no virtual registers left in the code.</p>
581
582</div>
583
584<!-- *********************************************************************** -->
585<div class="doc_section">
Chris Lattneraa5bcb52005-01-28 17:22:53 +0000586 <a name="codegenalgs">Target-independent code generation algorithms</a>
587</div>
588<!-- *********************************************************************** -->
589
590<div class="doc_text">
591
592<p>This section documents the phases described in the <a
593href="high-level-design">high-level design of the code generator</a>. It
594explains how they work and some of the rationale behind their design.</p>
595
596</div>
597
598<!-- ======================================================================= -->
599<div class="doc_subsection">
600 <a name="instselect">Instruction Selection</a>
601</div>
602
603<div class="doc_text">
604<p>
605Instruction Selection is the process of translating the LLVM code input to the
606code generator into target-specific machine instructions. There are several
607well-known ways to do this in the literature. In LLVM there are two main forms:
608the old-style 'simple' instruction selector (which effectively peephole selects
609each LLVM instruction into a series of machine instructions), and the new
610SelectionDAG based instruction selector.
611</p>
612
613<p>The 'simple' instruction selectors are tedious to write, require a lot of
614boiler plate code, and are difficult to get correct. Additionally, any
615optimizations written for a simple instruction selector cannot be used by other
616targets. For this reason, LLVM is moving to a new SelectionDAG based
617instruction selector, which is described in this section. If you are starting a
618new port, we recommend that you write the instruction selector using the
619SelectionDAG infrastructure.</p>
620
621<p>In time, most of the target-specific code for instruction selection will be
622auto-generated from the target .td files. For now, however, the <a
623href="#selectiondag_select">Select Phase</a> must still be written by hand.</p>
624</div>
625
626<!-- _______________________________________________________________________ -->
627<div class="doc_subsubsection">
628 <a name="selectiondag_intro">Introduction to SelectionDAGs</a>
629</div>
630
631<div class="doc_text">
632
633<p>
634The SelectionDAG provides an abstraction for representing code in a way that is
635amenable to instruction selection using automatic techniques
636(e.g. dynamic-programming based optimal pattern matching selectors), as well as
637an abstraction that is useful for other phases of code generation (in
638particular, instruction scheduling). Additionally, the SelectionDAG provides a
639host representation where a large variety of very-low-level (but
640target-independent) <a href="#selectiondag_optimize">optimizations</a> may be
641performed: ones which require extensive information about the instructions
642efficiently supported by the target.
643</p>
644
645<p>
646The SelectionDAG is a Directed-Acyclic-Graph whose nodes are instances of the
647<tt>SDNode</tt> class. The primary payload of the Node is its operation code
648(Opcode) that indicates what the operation the node performs. The various
649operation node types are described at the top of the
650<tt>include/llvm/CodeGen/SelectionDAGNodes.h</tt> file. Depending on the operation, nodes may contain additional information (e.g. the condition code
651for a SETCC node) contained in a derived class.</p>
652
653<p>Each node in the graph may define multiple values
654(e.g. for a combined div/rem operation and many other situations), though most
655operations define a single value. Each node also has some number of operands,
656which are edges to the node defining the used value. Because nodes may define
657multiple values, edges are represented by instances of the <tt>SDOperand</tt>
658class, which is a &lt;SDNode, unsigned&gt; pair, indicating the node and result
659value being used. Each value produced by a SDNode has an associated
660MVT::ValueType, indicating what type the value is.
661</p>
662
663<p>
664SelectionDAGs contain two different kinds of value: those that represent data
665flow and those that represent control flow dependencies. Data values are simple
666edges with a integer or floating point value type. Control edges are
667represented as "chain" edges which are of type MVT::Other. These edges provide
668an ordering between nodes that have side effects (such as
669loads/stores/calls/return/etc). All nodes that have side effects should take a
670token chain as input and produce a new one as output. By convention, token
671chain inputs are always operand #0, and chain results are always the last
672value produced by an operation.</p>
673
674<p>
675A SelectionDAG has designated "Entry" and "Root" nodes. The Entry node is
676always a marker node with Opcode of ISD::TokenFactor. The Root node is the
677final side effecting node in the token chain (for example, in a single basic
678block function, this would be the return node).
679</p>
680
681<p>
682One important concept for SelectionDAGs is the notion of a "legal" vs "illegal"
683DAG. A legal DAG for a target is one that only uses supported operations and
684supported types. On PowerPC, for example, a DAG with any values of i1, i8, i16,
685or i64 type would be illegal. The <a href="#selectiondag_legalize">legalize</a>
686phase is the one responsible for turning an illegal DAG into a legal DAG.
687</p>
688</div>
689
690<!-- _______________________________________________________________________ -->
691<div class="doc_subsubsection">
692 <a name="selectiondag_process">SelectionDAG Code Generation Process</a>
693</div>
694
695<div class="doc_text">
696
697<p>
698SelectionDAG-based instruction selection consists of the following steps:
699</p>
700
701<ol>
702<li><a href="#selectiondag_build">Build initial DAG</a> - This stage performs
703 a simple translation from the input LLVM code to an illegal SelectionDAG.
704 </li>
705<li><a href="#selectiondag_optimize">Optimize SelectionDAG</a> - This stage
706 performs simple optimizations on the SelectionDAG to simplify it and
707 recognize meta instructions (like rotates and div/rem pairs) for
708 targets that support these meta operations. This makes the resultant code
709 more efficient and the 'select' phase more simple.
710</li>
711<li><a href="#selectiondag_legalize">Legalize SelectionDAG</a> - This stage
712 converts the illegal SelectionDAG to a legal SelectionDAG, by eliminating
713 unsupported operations and data types.</li>
714<li><a href="#selectiondag_optimize">Optimize SelectionDAG (#2)</a> - This
715 second run of the SelectionDAG optimized the newly legalized DAG, to
716 eliminate inefficiencies introduced by legalization.</li>
717<li><a href="#selectiondag_select">Select instructions from DAG</a> - Finally,
718 the target instruction selector matches the DAG operations to target
719 instructions, emitting them and building the MachineFunction being
720 compiled.</li>
721</ol>
722
723<p>After all of these steps are complete, the SelectionDAG is destroyed and the
724rest of the code generation passes are run.</p>
725
726</div>
727
728<!-- _______________________________________________________________________ -->
729<div class="doc_subsubsection">
730 <a name="selectiondag_build">Initial SelectionDAG Construction</a>
731</div>
732
733<div class="doc_text">
734
735<p>
736The initial SelectionDAG is naively peephole expanded from the LLVM input by
737the SelectionDAGLowering class in the SelectionDAGISel.cpp file. The idea of
738doing this pass is to expose as much low-level target-specific details to the
739SelectionDAG as possible. This pass is mostly hard-coded (e.g. an LLVM add
740turns into a SDNode add, a geteelementptr is expanded into the obvious
741arithmetic, etc) but does require target-specific hooks to lower calls and
742returns, varargs, etc. For these features, the TargetLowering interface is
743used.
744</p>
745
746</div>
747
748<!-- _______________________________________________________________________ -->
749<div class="doc_subsubsection">
750 <a name="selectiondag_legalize">SelectionDAG Legalize Phase</a>
751</div>
752
753<div class="doc_text">
754
755<p>The Legalize phase is in charge of converting a DAG to only use the types and
756operations that are natively supported by the target. This involves two major
757tasks:</p>
758
759<ol>
760<li><p>Convert values of unsupported types to values of supported types.</p>
761 <p>There are two main ways of doing this: promoting a small type to a larger
762 type (e.g. f32 -> f64, or i16 -> i32), and expanding larger integer types
763 to smaller ones (e.g. implementing i64 with i32 operations where
764 possible). Promotion insert sign and zero extensions as needed to make
765 sure that the final code has the same behavior as the input.</p>
766</li>
767
768<li><p>Eliminate operations that are not supported by the target in a supported
769 type.</p>
770 <p>Targets often have wierd constraints, such as not supporting every
771 operation on every supported datatype (e.g. X86 does not support byte
772 conditional moves). Legalize takes care of either open-coding another
773 sequence of operations to emulate the operation (this is known as
774 expansion), promoting to a larger type that supports the operation
775 (promotion), or can use a target-specific hook to implement the
776 legalization.</p>
777</li>
778</ol>
779
780<p>
781Instead of using a Legalize pass, we could require that every target-specific
782<a href="#selectiondag_optimize">selector</a> support and expand every operator
783and type even if they are not supported and may require many instructions to
784implement (in fact, this is the approach taken by the "simple" selectors).
785However, using a Legalize pass allows all of the cannonicalization patterns to
786be shared across targets, and makes it very easy to optimize the cannonicalized
787code (because it is still in the form of a DAG).
788</p>
789
790</div>
791
792<!-- _______________________________________________________________________ -->
793<div class="doc_subsubsection">
794 <a name="selectiondag_optimize">SelectionDAG Optimization Phase</a>
795</div>
796
797<div class="doc_text">
798
799<p>
800The SelectionDAG optimization phase is run twice for code generation: once
801immediately after the DAG is built and once after legalization. The first pass
802allows the initial code to be cleaned up, (for example) performing optimizations
803that depend on knowing that the operators have restricted type inputs. The second
804pass cleans up the messy code generated by the Legalize pass, allowing Legalize to
805be very simple (not having to take into account many special cases.
806</p>
807
808<p>
809One important class of optimizations that this pass will do in the future is
810optimizing inserted sign and zero extension instructions. Here are some good
811papers on the subject:</p>
812
813<p>
814"<a href="http://www.eecs.harvard.edu/~nr/pubs/widen-abstract.html">Widening
815integer arithmetic</a>"<br>
816Kevin Redwine and Norman Ramsey<br>
817International Conference on Compiler Construction (CC) 2004
818</p>
819
820
821<p>
822 "<a href="http://portal.acm.org/citation.cfm?doid=512529.512552">Effective
823 sign extension elimination</a>"<br>
824 Motohiro Kawahito, Hideaki Komatsu, and Toshio Nakatani<br>
825 Proceedings of the ACM SIGPLAN 2002 Conference on Programming Language Design
826 and Implementation.
827</p>
828
829</div>
830
831<!-- _______________________________________________________________________ -->
832<div class="doc_subsubsection">
833 <a name="selectiondag_select">SelectionDAG Select Phase</a>
834</div>
835
836<div class="doc_text">
837
838<p>The Select phase is the bulk of the target-specific code for instruction
839selection. This phase takes a legal SelectionDAG as input, and does simple
840pattern matching on the DAG to generate code. In time, the Select phase will
841be automatically generated from the targets InstrInfo.td file, which is why we
842want to make the Select phase a simple and mechanical as possible.</p>
843
844</div>
845
846<!-- _______________________________________________________________________ -->
847<div class="doc_subsubsection">
848 <a name="selectiondag_future">Future directions for the SelectionDAG</a>
849</div>
850
851<div class="doc_text">
852
853<ol>
854<li>Optional whole-function selection.</li>
855<li>Select is a graph translation phase.</li>
856<li>Place the machine instrs resulting from Select according to register pressure or a schedule.</li>
857<li>DAG Scheduling.</li>
858<li>Auto-generate the Select phase from the target .td files.</li>
859</ol>
860
861</div>
862
863
864<!-- *********************************************************************** -->
865<div class="doc_section">
Chris Lattnerec94f802004-06-04 00:16:02 +0000866 <a name="targetimpls">Target description implementations</a>
867</div>
868<!-- *********************************************************************** -->
869
870<div class="doc_text">
871
872<p>This section of the document explains any features or design decisions that
873are specific to the code generator for a particular target.</p>
874
875</div>
876
877
878<!-- ======================================================================= -->
879<div class="doc_subsection">
880 <a name="x86">The X86 backend</a>
881</div>
882
883<div class="doc_text">
884
885<p>
886The X86 code generator lives in the <tt>lib/Target/X86</tt> directory. This
887code generator currently targets a generic P6-like processor. As such, it
888produces a few P6-and-above instructions (like conditional moves), but it does
889not make use of newer features like MMX or SSE. In the future, the X86 backend
Chris Lattneraa5bcb52005-01-28 17:22:53 +0000890will have sub-target support added for specific processor families and
Chris Lattnerec94f802004-06-04 00:16:02 +0000891implementations.</p>
892
893</div>
894
895<!-- _______________________________________________________________________ -->
896<div class="doc_subsubsection">
897 <a name="x86_memory">Representing X86 addressing modes in MachineInstrs</a>
898</div>
899
900<div class="doc_text">
901
Misha Brukman600df452005-02-17 22:22:24 +0000902<p>The x86 has a very flexible way of accessing memory. It is capable of
Chris Lattnerec94f802004-06-04 00:16:02 +0000903forming memory addresses of the following expression directly in integer
904instructions (which use ModR/M addressing):</p>
905
906<pre>
907 Base+[1,2,4,8]*IndexReg+Disp32
908</pre>
909
Misha Brukman600df452005-02-17 22:22:24 +0000910<p>In order to represent this, LLVM tracks no less than 4 operands for each
911memory operand of this form. This means that the "load" form of 'mov' has the
912following <tt>MachineOperand</tt>s in this order:</p>
Chris Lattnerec94f802004-06-04 00:16:02 +0000913
914<pre>
915Index: 0 | 1 2 3 4
916Meaning: DestReg, | BaseReg, Scale, IndexReg, Displacement
917OperandTy: VirtReg, | VirtReg, UnsImm, VirtReg, SignExtImm
918</pre>
919
920<p>Stores and all other instructions treat the four memory operands in the same
921way, in the same order.</p>
Chris Lattnerec94f802004-06-04 00:16:02 +0000922
923</div>
924
925<!-- _______________________________________________________________________ -->
926<div class="doc_subsubsection">
927 <a name="x86_names">Instruction naming</a>
928</div>
929
930<div class="doc_text">
931
932<p>
933An instruction name consists of the base name, a default operand size
934followed by a character per operand with an optional special size. For
935example:</p>
936
937<p>
938<tt>ADD8rr</tt> -&gt; add, 8-bit register, 8-bit register<br>
939<tt>IMUL16rmi</tt> -&gt; imul, 16-bit register, 16-bit memory, 16-bit immediate<br>
940<tt>IMUL16rmi8</tt> -&gt; imul, 16-bit register, 16-bit memory, 8-bit immediate<br>
941<tt>MOVSX32rm16</tt> -&gt; movsx, 32-bit register, 16-bit memory
942</p>
943
944</div>
Chris Lattnerce52b7e2004-06-01 06:48:00 +0000945
946<!-- *********************************************************************** -->
947<hr>
948<address>
949 <a href="http://jigsaw.w3.org/css-validator/check/referer"><img
950 src="http://jigsaw.w3.org/css-validator/images/vcss" alt="Valid CSS!"></a>
951 <a href="http://validator.w3.org/check/referer"><img
952 src="http://www.w3.org/Icons/valid-html401" alt="Valid HTML 4.01!" /></a>
953
954 <a href="mailto:sabre@nondot.org">Chris Lattner</a><br>
955 <a href="http://llvm.cs.uiuc.edu">The LLVM Compiler Infrastructure</a><br>
956 Last modified: $Date$
957</address>
958
959</body>
960</html>