Misha Brukman	8eb6719	2004-09-06 22:58:13 +0000	[diff] [blame]	1	<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
				2	"http://www.w3.org/TR/html4/strict.dtd">
				3	<html>
				4	<head>
Chris Lattner	7897538	2008-11-11 19:30:41 +0000	[diff] [blame^]	5	<title>Writing an LLVM Compiler Backend</title>
Misha Brukman	8eb6719	2004-09-06 22:58:13 +0000	[diff] [blame]	6	<link rel="stylesheet" href="llvm.css" type="text/css">
				7	</head>
				8
				9	<body>
				10
Chris Lattner	7897538	2008-11-11 19:30:41 +0000	[diff] [blame^]	11	<div class="doc_title">
				12	Writing an LLVM Compiler Backend
Misha Brukman	8eb6719	2004-09-06 22:58:13 +0000	[diff] [blame]	13	</div>
				14
				15	<ol>
				16	<li><a href="#intro">Introduction</a>
Chris Lattner	7897538	2008-11-11 19:30:41 +0000	[diff] [blame^]	17	<ul>
				18	<li><a href="#Audience">Audience</a></li>
				19	<li><a href="#Prerequisite">Prerequisite Reading</a></li>
				20	<li><a href="#Basic">Basic Steps</a></li>
				21	<li><a href="#Preliminaries">Preliminaries</a></li>
				22	</ul></li>
				23	<li><a href="#TargetMachine">Target Machine</a></li>
				24	<li><a href="#RegisterSet">Register Set and Register Classes</a>
				25	<ul>
				26	<li><a href="#RegisterDef">Defining a Register</a></li>
				27	<li><a href="#RegisterClassDef">Defining a Register Class</a></li>
				28	<li><a href="#implementRegister">Implement a subclass of TargetRegisterInfo</a></li>
				29	</ul></li>
				30	<li><a href="#InstructionSet">Instruction Set</a>
				31	<ul>
				32	<li><a href="#implementInstr">Implement a subclass of TargetInstrInfo</a></li>
				33	<li><a href="#branchFolding">Branch Folding and If Conversion</a></li>
				34	</ul></li>
				35	<li><a href="#InstructionSelector">Instruction Selector</a>
				36	<ul>
				37	<li><a href="#LegalizePhase">The SelectionDAG Legalize Phase</a>
				38	<ul>
				39	<li><a href="#promote">Promote</a></li>
				40	<li><a href="#expand">Expand</a></li>
				41	<li><a href="#custom">Custom</a></li>
				42	<li><a href="#legal">Legal</a></li>
				43	</ul></li>
				44	<li><a href="#callingConventions">Calling Conventions</a></li>
				45	</ul></li>
				46	<li><a href="#assemblyPrinter">Assembly Printer</a></li>
				47	<li><a href="#subtargetSupport">Subtarget Support</a></li>
				48	<li><a href="#jitSupport">JIT Support</a>
				49	<ul>
				50	<li><a href="#mce">Machine Code Emitter</a></li>
				51	<li><a href="#targetJITInfo">Target JIT Info</a></li>
				52	</ul></li>
Misha Brukman	8eb6719	2004-09-06 22:58:13 +0000	[diff] [blame]	53	</ol>
				54
				55	<div class="doc_author">
Chris Lattner	7897538	2008-11-11 19:30:41 +0000	[diff] [blame^]	56	<p>Written by <a href="http://www.woo.com">Mason Woo</a> and <a href="http://misha.brukman.net">Misha Brukman</a></p>
Misha Brukman	8eb6719	2004-09-06 22:58:13 +0000	[diff] [blame]	57	</div>
				58
				59	<!-- *********************************************************************** -->
				60	<div class="doc_section">
				61	<a name="intro">Introduction</a>
				62	</div>
				63	<!-- *********************************************************************** -->
				64
				65	<div class="doc_text">
Chris Lattner	7897538	2008-11-11 19:30:41 +0000	[diff] [blame^]	66	<p>This document describes techniques for writing compiler backends
				67	that convert the LLVM IR (intermediate representation) to code for a specified
				68	machine or for another language. Code intended for a specific machine can take the
				69	form of either assembly code or binary code (usable for a JIT compiler).</p>
Misha Brukman	8eb6719	2004-09-06 22:58:13 +0000	[diff] [blame]	70
Chris Lattner	7897538	2008-11-11 19:30:41 +0000	[diff] [blame^]	71	<p>The backend of LLVM features a target-independent code generator
				72	that may create output for several types of target CPUs, including X86,
				73	PowerPC, Alpha, and SPARC. The backend may also be used to generate code
				74	targeted at SPUs of the Cell processor or GPUs to support the execution of
				75	compute kernels.</p>
Misha Brukman	8eb6719	2004-09-06 22:58:13 +0000	[diff] [blame]	76
Chris Lattner	7897538	2008-11-11 19:30:41 +0000	[diff] [blame^]	77	<p>The document focuses on existing examples found in subdirectories
				78	of <tt>llvm/lib/Target</tt> in a downloaded LLVM release. In particular, this document
				79	focuses on the example of creating a static compiler (one that emits text
				80	assembly) for a SPARC target, because SPARC has fairly standard
				81	characteristics, such as a RISC instruction set and straightforward calling
				82	conventions.</p>
Misha Brukman	8eb6719	2004-09-06 22:58:13 +0000	[diff] [blame]	83	</div>
				84
Misha Brukman	8eb6719	2004-09-06 22:58:13 +0000	[diff] [blame]	85	<div class="doc_subsection">
Chris Lattner	7897538	2008-11-11 19:30:41 +0000	[diff] [blame^]	86	<a name="Audience">Audience</a>
				87	</div>
Misha Brukman	8eb6719	2004-09-06 22:58:13 +0000	[diff] [blame]	88
				89	<div class="doc_text">
Chris Lattner	7897538	2008-11-11 19:30:41 +0000	[diff] [blame^]	90	<p>The audience for this document is anyone who needs to write an
				91	LLVM backend to generate code for a specific hardware or software target.</p>
				92	</div>
Misha Brukman	8eb6719	2004-09-06 22:58:13 +0000	[diff] [blame]	93
Chris Lattner	7897538	2008-11-11 19:30:41 +0000	[diff] [blame^]	94	<div class="doc_subsection">
				95	<a name="Prerequisite">Prerequisite Reading</a>
				96	</div>
Misha Brukman	8eb6719	2004-09-06 22:58:13 +0000	[diff] [blame]	97
Chris Lattner	7897538	2008-11-11 19:30:41 +0000	[diff] [blame^]	98	<div class="doc_text">
				99	<p>Read these essential documents before continuing:</p>
				100	<ul>
				101	<li>
				102	<i><a href="http://www.llvm.org/docs/LangRef.html">LLVM Language Reference Manual</a></i> -
				103	a reference manual for the LLVM assembly language.
				104	</li>
				105	<li>
				106	<i><a href="http://www.llvm.org/docs/CodeGenerator.html">The LLVM Target-Independent Code Generator</a></i> -
				107	a guide to the components (classes and code generation algorithms) for translating
				108	the LLVM internal representation to machine code for a specified target.
				109	Pay particular attention to the descriptions of code generation stages:
				110	Instruction Selection, Scheduling and Formation, SSA-based Optimization,
				111	Register Allocation, Prolog/Epilog Code Insertion, Late Machine Code Optimizations,
				112	and Code Emission.
				113	</li>
				114	<li>
				115	<i><a href="http://www.llvm.org/docs/TableGenFundamentals.html">TableGen Fundamentals</a></i> -
				116	a document that describes the TableGen (tblgen) application that manages domain-specific
				117	information to support LLVM code generation. TableGen processes input from a
				118	target description file (.td suffix) and generates C++ code that can be used
				119	for code generation.
				120	</li>
				121	<li>
				122	<i><a href="http://www.llvm.org/docs/WritingAnLLVMPass.html">Writing an LLVM Pass</a></i> -
				123	relevant because the assembly printer is a FunctionPass, as are several SelectionDAG processing steps.
				124	</li>
				125	</ul>
				126	<p>To follow the SPARC examples in this document, have a copy of
				127	<i><a href="http://www.sparc.org/standards/V8.pdf">The SPARC Architecture Manual, Version 8</a></i>
				128	for reference. For details about the ARM instruction set, refer to the
				129	<i><a href="http://infocenter.arm.com/">ARM Architecture Reference Manual</a></i>.
				130	For more about the GNU Assembler format (GAS), see
				131	<i><a href="http://sourceware.org/binutils/docs/as/index.html">Using As</a></i>,
				132	especially for the assembly printer. <i>Using As</i> contains lists of target-machine-dependent features.</p>
				133	</div>
				134
				135	<div class="doc_subsection">
				136	<a name="Basic">Basic Steps</a>
				137	</div>
				138	<div class="doc_text">
				139	<p>To write a compiler
				140	backend for LLVM that converts the LLVM IR (intermediate representation)
				141	to code for a specified target (machine or other language), follow these steps:</p>
Misha Brukman	8eb6719	2004-09-06 22:58:13 +0000	[diff] [blame]	142
				143	<ul>
Chris Lattner	7897538	2008-11-11 19:30:41 +0000	[diff] [blame^]	144	<li>
				145	Create a subclass of the TargetMachine class that describes
				146	characteristics of your target machine. Copy existing examples of specific
				147	TargetMachine class and header files; for example, start with <tt>SparcTargetMachine.cpp</tt>
				148	and <tt>SparcTargetMachine.h</tt>, but change the file names for your target. Similarly,
				149	change code that references "Sparc" to reference your target. </li>
Misha Brukman	8eb6719	2004-09-06 22:58:13 +0000	[diff] [blame]	150
Chris Lattner	7897538	2008-11-11 19:30:41 +0000	[diff] [blame^]	151	<li>Describe the register set of the target. Use TableGen to generate
				152	code for register definition, register aliases, and register classes from a
				153	target-specific <tt>RegisterInfo.td</tt> input file. You should also write additional
				154	code for a subclass of the TargetRegisterInfo class that represents the class
				155	register file data used for register allocation and also describes the
				156	interactions between registers.</li>
Misha Brukman	8eb6719	2004-09-06 22:58:13 +0000	[diff] [blame]	157
Chris Lattner	7897538	2008-11-11 19:30:41 +0000	[diff] [blame^]	158	<li>Describe the instruction set of the target. Use TableGen to
				159	generate code for target-specific instructions from target-specific versions of
				160	<tt>TargetInstrFormats.td</tt> and <tt>TargetInstrInfo.td</tt>. You should write additional code
				161	for a subclass of the TargetInstrInfo
				162	class to represent machine
				163	instructions supported by the target machine. </li>
Misha Brukman	8eb6719	2004-09-06 22:58:13 +0000	[diff] [blame]	164
Chris Lattner	7897538	2008-11-11 19:30:41 +0000	[diff] [blame^]	165	<li>Describe the selection and conversion of the LLVM IR from a DAG (directed
				166	acyclic graph) representation of instructions to native target-specific
				167	instructions. Use TableGen to generate code that matches patterns and selects
				168	instructions based on additional information in a target-specific version of
				169	<tt>TargetInstrInfo.td</tt>. Write code for <tt>XXXISelDAGToDAG.cpp</tt>
				170	(where XXX identifies the specific target) to perform pattern
				171	matching and DAG-to-DAG instruction selection. Also write code in <tt>XXXISelLowering.cpp</tt>
				172	to replace or remove operations and data types that are not supported natively
				173	in a SelectionDAG. </li>
Misha Brukman	8eb6719	2004-09-06 22:58:13 +0000	[diff] [blame]	174
Chris Lattner	7897538	2008-11-11 19:30:41 +0000	[diff] [blame^]	175	<li>Write code for an
				176	assembly printer that converts LLVM IR to a GAS format for your target machine.
				177	You should add assembly strings to the instructions defined in your
				178	target-specific version of <tt>TargetInstrInfo.td</tt>. You should also write code for a
				179	subclass of AsmPrinter that performs the LLVM-to-assembly conversion and a
				180	trivial subclass of TargetAsmInfo.</li>
Misha Brukman	8eb6719	2004-09-06 22:58:13 +0000	[diff] [blame]	181
Chris Lattner	7897538	2008-11-11 19:30:41 +0000	[diff] [blame^]	182	<li>Optionally, add support for subtargets (that is, variants with
				183	different capabilities). You should also write code for a subclass of the
				184	TargetSubtarget class, which allows you to use the <tt>-mcpu=</tt>
				185	and <tt>-mattr=</tt> command-line options.</li>
Misha Brukman	8eb6719	2004-09-06 22:58:13 +0000	[diff] [blame]	186
Chris Lattner	7897538	2008-11-11 19:30:41 +0000	[diff] [blame^]	187	<li>Optionally, add JIT support and create a machine code emitter (subclass
				188	of TargetJITInfo) that is used to emit binary code directly into memory. </li>
Misha Brukman	8eb6719	2004-09-06 22:58:13 +0000	[diff] [blame]	189	</ul>
				190
Chris Lattner	7897538	2008-11-11 19:30:41 +0000	[diff] [blame^]	191	<p>In the .cpp and .h files, initially stub up these methods and
				192	then implement them later. Initially, you may not know which private members
				193	the class will need and which components will need to be subclassed.</p>
Misha Brukman	8eb6719	2004-09-06 22:58:13 +0000	[diff] [blame]	194	</div>
				195
Misha Brukman	8eb6719	2004-09-06 22:58:13 +0000	[diff] [blame]	196	<div class="doc_subsection">
Chris Lattner	7897538	2008-11-11 19:30:41 +0000	[diff] [blame^]	197	<a name="Preliminaries">Preliminaries</a>
Misha Brukman	8eb6719	2004-09-06 22:58:13 +0000	[diff] [blame]	198	</div>
Misha Brukman	8eb6719	2004-09-06 22:58:13 +0000	[diff] [blame]	199	<div class="doc_text">
Chris Lattner	7897538	2008-11-11 19:30:41 +0000	[diff] [blame^]	200	<p>To actually create
				201	your compiler backend, you need to create and modify a few files. The absolute
				202	minimum is discussed here, but to actually use the LLVM target-independent code
				203	generator, you must perform the steps described in the <a
				204	href="http://www.llvm.org/docs/CodeGenerator.html">LLVM
				205	Target-Independent Code Generator</a> document.</p>
Misha Brukman	8eb6719	2004-09-06 22:58:13 +0000	[diff] [blame]	206
Chris Lattner	7897538	2008-11-11 19:30:41 +0000	[diff] [blame^]	207	<p>First, you should
				208	create a subdirectory under <tt>lib/Target</tt> to hold all the files related to your
				209	target. If your target is called "Dummy", create the directory
Matthijs Kooijman	6aa8127	2008-09-29 11:52:22 +0000	[diff] [blame]	210	<tt>lib/Target/Dummy</tt>.</p>
				211
Chris Lattner	7897538	2008-11-11 19:30:41 +0000	[diff] [blame^]	212	<p>In this new
				213	directory, create a <tt>Makefile</tt>. It is easiest to copy a <tt>Makefile</tt> of another
				214	target and modify it. It should at least contain the <tt>LEVEL</tt>, <tt>LIBRARYNAME</tt> and
				215	<tt>TARGET</tt> variables, and then include <tt>$(LEVEL)/Makefile.common</tt>. The library can be
				216	named LLVMDummy (for example, see the MIPS target). Alternatively, you can
				217	split the library into LLVMDummyCodeGen and LLVMDummyAsmPrinter, the latter of
				218	which should be implemented in a subdirectory below <tt>lib/Target/Dummy</tt> (for
				219	example, see the PowerPC target).</p>
Matthijs Kooijman	6aa8127	2008-09-29 11:52:22 +0000	[diff] [blame]	220
Chris Lattner	7897538	2008-11-11 19:30:41 +0000	[diff] [blame^]	221	<p>Note that these two
				222	naming schemes are hardcoded into <tt>llvm-config</tt>. Using any other naming scheme
				223	will confuse <tt>llvm-config</tt> and produce lots of (seemingly unrelated) linker
				224	errors when linking <tt>llc</tt>.</p>
Matthijs Kooijman	6aa8127	2008-09-29 11:52:22 +0000	[diff] [blame]	225
Chris Lattner	7897538	2008-11-11 19:30:41 +0000	[diff] [blame^]	226	<p>To make your target
				227	actually do something, you need to implement a subclass of TargetMachine. This
				228	implementation should typically be in the file
				229	<tt>lib/Target/Dummy/DummyTargetMachine.cpp</tt>, but any file in the <tt>lib/Target/Dummy</tt> directory will
				230	be built and should work. To use LLVM's target-independent
				231	code generator, you should do what all current machine backends do: create a subclass
				232	of LLVMTargetMachine. (To create a target from scratch, create a subclass of
				233	TargetMachine.)</p>
Matthijs Kooijman	6aa8127	2008-09-29 11:52:22 +0000	[diff] [blame]	234
Chris Lattner	7897538	2008-11-11 19:30:41 +0000	[diff] [blame^]	235	<p>To get LLVM to
				236	actually build and link your target, you need to add it to the <tt>TARGETS_TO_BUILD</tt>
				237	variable. To do this, you modify the configure script to know about your target
				238	when parsing the <tt>--enable-targets</tt> option. Search the configure script for <tt>TARGETS_TO_BUILD</tt>,
				239	add your target to the lists there (some creativity required) and then
				240	reconfigure. Alternatively, you can change <tt>autoconf/configure.ac</tt> and
				241	regenerate configure by running <tt>./autoconf/AutoRegen.sh</tt>.</p>
Matthijs Kooijman	6aa8127	2008-09-29 11:52:22 +0000	[diff] [blame]	242	</div>
Misha Brukman	8eb6719	2004-09-06 22:58:13 +0000	[diff] [blame]	243
				244	<!-- *********************************************************************** -->
				245	<div class="doc_section">
Chris Lattner	7897538	2008-11-11 19:30:41 +0000	[diff] [blame^]	246	<a name="TargetMachine">Target Machine</a>
				247	</div>
				248	<!-- *********************************************************************** -->
				249	<div class="doc_text">
				250	<p>LLVMTargetMachine is designed as a base class for targets
				251	implemented with the LLVM target-independent code generator. The
				252	LLVMTargetMachine class should be specialized by a concrete target class that
				253	implements the various virtual methods. LLVMTargetMachine is defined as a
				254	subclass of TargetMachine in <tt>include/llvm/Target/TargetMachine.h</tt>. The
				255	TargetMachine class implementation (<tt>TargetMachine.cpp</tt>) also processes numerous
				256	command-line options. </p>
				257
				258	<p>To create a concrete target-specific subclass of
				259	LLVMTargetMachine, start by copying an existing TargetMachine class and header.
				260	You should name the files that you create to reflect your specific target. For
				261	instance, for the SPARC target, name the files <tt>SparcTargetMachine.h</tt> and
				262	<tt>SparcTargetMachine.cpp</tt>.</p>
				263
				264	<p>For a target machine XXX, the implementation of XXXTargetMachine
				265	must have access methods to obtain objects that represent target components.
				266	These methods are named <tt>get*Info</tt> and are intended to obtain the instruction set
				267	(<tt>getInstrInfo</tt>), register set (<tt>getRegisterInfo</tt>), stack frame layout
				268	(<tt>getFrameInfo</tt>), and similar information. XXXTargetMachine must also implement
				269	the <tt>getTargetData</tt> method to access an object with target-specific data
				270	characteristics, such as data type size and alignment requirements. </p>
				271
				272	<p>For instance, for the SPARC target, the header file <tt>SparcTargetMachine.h</tt>
				273	declares prototypes for several <tt>get*Info</tt> and <tt>getTargetData</tt> methods that simply
				274	return a class member. </p>
				275	</div>
				276
				277	<div class="doc_code">
				278	<pre>namespace llvm {
				279
				280	class Module;
				281
				282	class SparcTargetMachine : public LLVMTargetMachine {
				283	  const TargetData DataLayout; // Calculates type size & alignment
				284	  SparcSubtarget Subtarget;
				285	  SparcInstrInfo InstrInfo;
				286	  TargetFrameInfo FrameInfo;
				287
				288	protected:
				289	  virtual const TargetAsmInfo *createTargetAsmInfo() const;
				291
				292	public:
				293	  SparcTargetMachine(const Module &M, const std::string &FS);
				294
				295	  virtual const SparcInstrInfo *getInstrInfo() const { return &InstrInfo; }
				296	  virtual const TargetFrameInfo *getFrameInfo() const { return &FrameInfo; }
				297	  virtual const TargetSubtarget *getSubtargetImpl() const { return &Subtarget; }
				298	  virtual const TargetRegisterInfo *getRegisterInfo() const {
				299	    return &InstrInfo.getRegisterInfo();
				300	  }
				301	  virtual const TargetData *getTargetData() const { return &DataLayout; }
				302	  static unsigned getModuleMatchQuality(const Module &M);
				303
				304	  // Pass Pipeline Configuration
				305	  virtual bool addInstSelector(PassManagerBase &PM, bool Fast);
				306	  virtual bool addPreEmitPass(PassManagerBase &PM, bool Fast);
				307	  virtual bool addAssemblyEmitter(PassManagerBase &PM, bool Fast,
				308	                                  std::ostream &Out);
				309	};
				310
				311	} // end namespace llvm
				312	</pre>
				313	</div>
				314
				315	<div class="doc_text"><p>You need to support the following access methods:</p>
				316	<ul>
				317	<li><tt>getInstrInfo</tt></li>
				318	<li><tt>getRegisterInfo</tt></li>
				319	<li><tt>getFrameInfo</tt></li>
				320	<li><tt>getTargetData</tt></li>
				321	<li><tt>getSubtargetImpl</tt></li>
				322	</ul>
				323	<p>For some targets, you also need to support the following methods:
				324	</p>
				325
				326	<ul>
				327	<li><tt>getTargetLowering </tt></li>
				328	<li><tt>getJITInfo</tt></li>
				329	</ul>
				330	<p>In addition, the XXXTargetMachine constructor should specify a
				331	TargetDescription string that determines the data layout for the target machine,
				332	including characteristics such as pointer size, alignment, and endianness. For
				333	example, the constructor for SparcTargetMachine contains the following: </p>
				334	</div>
				335
				336	<div class="doc_code">
				337	<pre>
				338	SparcTargetMachine::SparcTargetMachine(const Module &M, const std::string &FS)
				339	  : DataLayout("E-p:32:32-f128:128:128"),
				340	    Subtarget(M, FS), InstrInfo(Subtarget),
				341	    FrameInfo(TargetFrameInfo::StackGrowsDown, 8, 0) {
				342	}
				343	</pre>
				344	</div>
				345
				346	<div class="doc_text">
				347	<p>Hyphens separate portions of the TargetDescription string. </p>
				348	<ul>
				349	<li>The "E" in the string indicates a big-endian target data model; a
				350	lower-case "e" would indicate little-endian. </li>
				351	<li>"p:" is followed by pointer information: size, ABI alignment, and
				352	preferred alignment. If only two figures follow "p:", then the first value is
				353	pointer size, and the second value is both ABI and preferred alignment.</li>
				354	<li>The remaining portions each begin with a letter for numeric type alignment: "i", "f", "v", or "a"
				355	(corresponding to integer, floating point, vector, or aggregate). "i", "v", and
				356	"a" are followed by ABI alignment and preferred alignment. "f" is followed by
				357	three values: the first indicates the size of a long double, followed by ABI alignment
				358	and preferred alignment.</li>
				359	</ul>
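<p>As a rough illustration of how such a string breaks down, the following
self-contained C++ sketch splits a TargetDescription string at hyphens and
colons. The names <tt>AlignSpec</tt>, <tt>ParsedLayout</tt>, and
<tt>parseLayout</tt> are hypothetical and are not part of LLVM; in LLVM the
string is interpreted by the TargetData class. The sketch only handles the
forms that appear in the SPARC string above.</p>
<div class="doc_code">
<pre>
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical record for one alignment entry; not an LLVM type.
struct AlignSpec {
  char Kind;          // 'p', 'i', 'f', 'v', or 'a'
  unsigned Size;      // size in bits
  unsigned ABIAlign;  // ABI alignment in bits
  unsigned PrefAlign; // preferred alignment in bits
};

// Hypothetical result of parsing a whole description string.
struct ParsedLayout {
  bool BigEndian;
  std::vector<AlignSpec> Specs;
};

// Splits S at every occurrence of Sep.
static std::vector<std::string> splitAt(const std::string &S, char Sep) {
  std::vector<std::string> Parts;
  std::istringstream In(S);
  std::string Part;
  while (std::getline(In, Part, Sep))
    Parts.push_back(Part);
  return Parts;
}

// Converts a decimal string to unsigned; yields 0 for non-numeric input.
static unsigned toUnsigned(const std::string &S) {
  return static_cast<unsigned>(std::strtoul(S.c_str(), 0, 10));
}

ParsedLayout parseLayout(const std::string &Desc) {
  ParsedLayout Layout;
  Layout.BigEndian = false;
  std::vector<std::string> Fields = splitAt(Desc, '-');
  for (std::size_t i = 0; i < Fields.size(); ++i) {
    const std::string &Field = Fields[i];
    if (Field == "E" || Field == "e") {
      Layout.BigEndian = (Field == "E");
      continue;
    }
    std::vector<std::string> Parts = splitAt(Field, ':');
    if (Parts.empty() || Parts[0].empty())
      continue;
    // "p:32:32" carries the size after the first colon, while
    // "f128:128:128" attaches it to the letter. Normalize both forms
    // to the sequence: size, ABI alignment, preferred alignment.
    std::vector<unsigned> Nums;
    if (Parts[0].size() > 1)
      Nums.push_back(toUnsigned(Parts[0].substr(1)));
    for (std::size_t j = 1; j < Parts.size(); ++j)
      Nums.push_back(toUnsigned(Parts[j]));
    if (Nums.size() < 2)
      continue;
    AlignSpec Spec;
    Spec.Kind = Parts[0][0];
    Spec.Size = Nums[0];
    Spec.ABIAlign = Nums[1];
    // With only two figures, the second is both ABI and preferred alignment.
    Spec.PrefAlign = Nums.size() > 2 ? Nums[2] : Nums[1];
    Layout.Specs.push_back(Spec);
  }
  return Layout;
}
</pre>
</div>
<p>With the SPARC string, <tt>parseLayout("E-p:32:32-f128:128:128")</tt>
reports a big-endian target with one 32-bit "p" entry and one 128-bit "f"
entry.</p>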
				360	<p>You must also register your target using the RegisterTarget
				361	template. (See the TargetMachineRegistry class.) For example, in <tt>SparcTargetMachine.cpp</tt>,
				362	the target is registered with:</p>
				363	</div>
				364
				365	<div class="doc_code">
				366	<pre>
				367	namespace {
				368	  // Register the target.
				369	  RegisterTarget<SparcTargetMachine> X("sparc", "SPARC");
				370	}
				371	</pre>
				372	</div>
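<div class="doc_text">
<p>Registration of this kind relies on a file-scope object whose constructor
runs before <tt>main()</tt> and records the target in a global table. The
following standalone C++ sketch shows only that general idiom and assumes
nothing about LLVM's actual implementation: <tt>RegisterTargetSketch</tt>,
<tt>targetRegistry</tt>, <tt>lookupTarget</tt>, and
<tt>DummyTargetMachine</tt> are all hypothetical names.</p>
</div>
<div class="doc_code">
<pre>
#include <cassert>
#include <map>
#include <string>

// Hypothetical, simplified entry; LLVM's real registry stores more.
struct TargetEntry {
  std::string Name;
  std::string Description;
};

// Returns the process-wide table of registered targets. A function-local
// static is constructed on first use, which avoids depending on the
// initialization order of file-scope objects.
inline std::map<std::string, TargetEntry> &targetRegistry() {
  static std::map<std::string, TargetEntry> Registry;
  return Registry;
}

// Constructing an object of this type adds an entry as a side effect, so a
// file-scope instance registers its target before main() runs.
template <class TargetMachineT> struct RegisterTargetSketch {
  RegisterTargetSketch(const char *Name, const char *Desc) {
    TargetEntry Entry;
    Entry.Name = Name;
    Entry.Description = Desc;
    targetRegistry()[Name] = Entry;
  }
};

// Placeholder for a target machine class (hypothetical).
struct DummyTargetMachine {};

namespace {
  // Register the (hypothetical) Dummy target.
  RegisterTargetSketch<DummyTargetMachine> X("dummy", "Dummy (sketch)");
}

// Looks up a target by name; returns a null pointer if it is not registered.
const TargetEntry *lookupTarget(const std::string &Name) {
  const std::map<std::string, TargetEntry> &Registry = targetRegistry();
  std::map<std::string, TargetEntry>::const_iterator I = Registry.find(Name);
  if (I == Registry.end())
    return 0;
  return &I->second;
}
</pre>
</div>
<div class="doc_text">
<p>In this sketch, <tt>lookupTarget("dummy")</tt> finds the entry that the
file-scope object <tt>X</tt> added, while a name that was never registered
yields a null pointer.</p>
</div>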
				373
				374	<!-- *********************************************************************** -->
				375	<div class="doc_section">
				376	<a name="RegisterSet">Register Set and Register Classes</a>
				377	</div>
				378	<!-- *********************************************************************** -->
				379	<div class="doc_text">
				380	<p>You should describe
				381	a concrete target-specific class
				382	that represents the register file of a target machine. This class is
				383	called XXXRegisterInfo (where XXX identifies the target) and represents the
				384	class register file data that is used for register allocation and also
				385	describes the interactions between registers. </p>
				386
				387	<p>You also need to
				388	define register classes to categorize related registers. A register class
				389	should be added for groups of registers that are all treated the same way for
				390	some instruction. Typical examples are register classes that include integer,
				391	floating-point, or vector registers. A register allocator allows an
				392	instruction to use any register in a specified register class to perform the
				393	instruction in a similar manner. Virtual registers are allocated to
				394	instructions from these sets, and register classes let the target-independent
				395	register allocator automatically choose the actual registers.</p>
				396
				397	<p>Much of the code for registers, including register definition,
				398	register aliases, and register classes, is generated by TableGen from
				399	<tt>XXXRegisterInfo.td</tt> input files and placed in <tt>XXXGenRegisterInfo.h.inc</tt> and
				400	<tt>XXXGenRegisterInfo.inc</tt> output files. Some of the code in the implementation of
				401	XXXRegisterInfo requires hand-coding. </p>
				402	</div>
				403
				404	<!-- ======================================================================= -->
				405	<div class="doc_subsection">
				406	<a name="RegisterDef">Defining a Register</a>
				407	</div>
				408	<div class="doc_text">
				409	<p>The <tt>XXXRegisterInfo.td</tt> file typically starts with register definitions
				410	for a target machine. The Register class (specified in <tt>Target.td</tt>) is used to
				411	define an object for each register. The specified string <tt>n</tt> becomes the <tt>Name</tt> of
				412	the register. The basic Register object does not have any subregisters and does
				413	not specify any aliases.</p>
				414	</div>
				415	<div class="doc_code">
				416	<pre>
				417	class Register<string n> {
				418	  string Namespace = "";
				419	  string AsmName = n;
				420	  string Name = n;
				421	  int SpillSize = 0;
				422	  int SpillAlignment = 0;
				423	  list<Register> Aliases = [];
				424	  list<Register> SubRegs = [];
				425	  list<int> DwarfNumbers = [];
				426	}
				427	</pre>
				428	</div>
				429
				430	<div class="doc_text">
				431	<p>For example, in the <tt>X86RegisterInfo.td</tt> file, there are register
				432	definitions that utilize the Register class, such as:</p>
				433	</div>
				434	<div class="doc_code">
				435	<pre>
				436	def AL : Register<"AL">,
				437	DwarfRegNum<[0, 0, 0]>;
				438	</pre>
				439	</div>
				440
				441	<div class="doc_text">
				442	<p>This defines the register AL and assigns it values (with
				443	DwarfRegNum) that are used by <tt>gcc</tt>, <tt>gdb</tt>, or a debug information writer (such as
				444	DwarfWriter in <tt>llvm/lib/CodeGen</tt>) to identify a register. For register AL,
				445	DwarfRegNum takes an array of 3 values, representing 3 different modes: the
				446	first element is for X86-64, the second for EH (exception handling) on X86-32,
				447	and the third is generic. -1 is a special Dwarf number that indicates the gcc
				448	number is undefined, and -2 indicates the register number is invalid for this
				449	mode.</p>
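<p>The following standalone C++ sketch illustrates how a consumer might
interpret such a three-entry list, treating -1 and -2 as "no usable number".
The <tt>DwarfMode</tt>, <tt>DwarfRow</tt>, and <tt>getDwarfNumber</tt> names
are hypothetical and do not correspond to LLVM's generated code.</p>
<div class="doc_code">
<pre>
#include <cassert>

// Hypothetical names for the three modes described above.
enum DwarfMode { ModeX86_64 = 0, ModeX86_32_EH = 1, ModeGeneric = 2 };

const int DwarfUndefined = -1; // the gcc number is undefined
const int DwarfInvalid = -2;   // the register number is invalid for this mode

// One row per register, one column per mode, mirroring
// DwarfRegNum<[a, b, c]> in the target description.
struct DwarfRow {
  const char *RegName;
  int Numbers[3];
};

// Stores the number for Mode in Out and returns true, or returns false
// (leaving Out untouched) if the entry is one of the special values.
bool getDwarfNumber(const DwarfRow &Row, DwarfMode Mode, int &Out) {
  int N = Row.Numbers[Mode];
  if (N == DwarfUndefined || N == DwarfInvalid)
    return false;
  Out = N;
  return true;
}
</pre>
</div>
<p>For a row built from the AL definition above,
<tt>{"AL", {0, 0, 0}}</tt>, the lookup succeeds with 0 in every mode.</p>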
				450
				451	<p>From the previously described line in the <tt>X86RegisterInfo.td</tt>
				452	file, TableGen generates this code in the <tt>X86GenRegisterInfo.inc</tt> file:</p>
				453	</div>
				454	<div class="doc_code">
				455	<pre>
				456	static const unsigned GR8[] = { X86::AL, ... };
				457
				458	const unsigned AL_AliasSet[] = { X86::AX, X86::EAX, X86::RAX, 0 };
				459
				460	const TargetRegisterDesc RegisterDescriptors[] = {
				461	...
				462	{ "AL", "AL", AL_AliasSet, Empty_SubRegsSet, Empty_SubRegsSet, AL_SuperRegsSet }, ...
				463	</pre>
				464	</div>
				465
				466	<div class="doc_text">
				467	<p>From the register info file, TableGen generates a
				468	TargetRegisterDesc object for each register. TargetRegisterDesc is defined in
				469	<tt>include/llvm/Target/TargetRegisterInfo.h</tt> with the following fields:</p>
				470	</div>
				471
				472	<div class="doc_code">
				473	<pre>
				474	struct TargetRegisterDesc {
				475	  const char *AsmName;        // Assembly language name for the register
				476	  const char *Name;           // Printable name for the reg (for debugging)
				477	  const unsigned *AliasSet;   // Register Alias Set
				478	  const unsigned *SubRegs;    // Sub-register set
				479	  const unsigned *ImmSubRegs; // Immediate sub-register set
				480	  const unsigned *SuperRegs;  // Super-register set
				481	};</pre>
				482	</div>
				483
				484	<div class="doc_text">
				485	<p>TableGen uses the entire target description file (<tt>.td</tt>) to
				486	determine text names for the register (in the AsmName and Name fields of
				487	TargetRegisterDesc) and the relationships of other registers to the defined
				488	register (in the other TargetRegisterDesc fields). In this example, other
				489	definitions establish the registers "AX", "EAX", and "RAX" as aliases for one
				490	another, so TableGen generates a null-terminated array (AL_AliasSet) for this
				491	register alias set. </p>
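<p>Because the alias set is a null-terminated array, code that consumes it
can walk the list until it reaches 0. The following standalone C++ sketch
shows that pattern; the register numbers and the
<tt>regsOverlapSketch</tt> function are hypothetical and are not taken from
the generated X86 tables.</p>
<div class="doc_code">
<pre>
#include <cassert>

// Hypothetical register numbers; 0 is reserved as the list terminator.
enum { NoReg = 0, AL = 1, AX = 2, EAX = 3, RAX = 4, BL = 5 };

// Zero-terminated alias list, shaped like the generated AL_AliasSet.
const unsigned AL_AliasSetSketch[] = { AX, EAX, RAX, 0 };

// Returns true if B is the same register as A or appears in A's alias set.
bool regsOverlapSketch(unsigned A, const unsigned *AliasSetOfA, unsigned B) {
  if (A == B)
    return true;
  for (const unsigned *Alias = AliasSetOfA; *Alias != 0; ++Alias)
    if (*Alias == B)
      return true;
  return false;
}
</pre>
</div>
<p>Here <tt>regsOverlapSketch(AL, AL_AliasSetSketch, EAX)</tt> is true,
whereas the same call with <tt>BL</tt> is false because <tt>BL</tt> is not
in the list.</p>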
				492
				493	<p>The Register class is commonly used as a base class for more
				494	complex classes. In <tt>Target.td</tt>, the Register class is the base for the
				495	RegisterWithSubRegs class that is used to define registers that need to specify
				496	subregisters in the SubRegs list, as shown here:</p>
				497	</div>
				498	<div class="doc_code">
				499	<pre>
				500	class RegisterWithSubRegs<string n,
				501	                          list<Register> subregs> : Register<n> {
				502	  let SubRegs = subregs;
				503	}</pre>
				504	</div>
				505
				506	<div class="doc_text">
				507	<p>In <tt>SparcRegisterInfo.td</tt>, additional register classes are defined
				508	for SPARC: a Register subclass, SparcReg, and further subclasses: Ri, Rf, and
				509	Rd. SPARC registers are identified by 5-bit ID numbers, which is a feature
				510	common to these subclasses. Note the use of ‘let’ expressions to override values
				511	that are initially defined in a superclass (such as SubRegs field in the Rd
				512	class). </p>
				513	</div>
				514	<div class="doc_code">
				515	<pre>
				516	class SparcReg<string n> : Register<n> {
				517	  field bits<5> Num;
				518	  let Namespace = "SP";
				519	}
				520	// Ri - 32-bit integer registers
				521	class Ri<bits<5> num, string n> : SparcReg<n> {
				523	  let Num = num;
				524	}
				525	// Rf - 32-bit floating-point registers
				526	class Rf<bits<5> num, string n> : SparcReg<n> {
				528	  let Num = num;
				529	}
				530	// Rd - Slots in the FP register file for 64-bit
				531	// floating-point values.
				532	class Rd<bits<5> num, string n,
				533	         list<Register> subregs> : SparcReg<n> {
				534	  let Num = num;
				535	  let SubRegs = subregs;
				536	}</pre>
				537	</div>
				538	<div class="doc_text">
				539	<p>In the <tt>SparcRegisterInfo.td</tt> file, there are register definitions
				540	that utilize these subclasses of Register, such as:</p>
				541	</div>
				542	<div class="doc_code">
				543	<pre>
				544	def G0 : Ri< 0, "G0">, DwarfRegNum<[0]>;
				546	def G1 : Ri< 1, "G1">, DwarfRegNum<[1]>;
				547	...
				548	def F0 : Rf< 0, "F0">, DwarfRegNum<[32]>;
				550	def F1 : Rf< 1, "F1">, DwarfRegNum<[33]>;
				552	...
				553	def D0 : Rd< 0, "F0", [F0, F1]>, DwarfRegNum<[32]>;
				555	def D1 : Rd< 2, "F2", [F2, F3]>, DwarfRegNum<[34]>;
				557	</pre>
				558	</div>
				559	<div class="doc_text">
				560	<p>The last two registers shown above (D0 and D1) are double-precision
				561	floating-point registers that are aliases for pairs of single-precision
				562	floating-point sub-registers. In addition to aliases, the sub-register and
				563	super-register relationships of the defined register are in fields of a
				564	register’s TargetRegisterDesc.</p>
				565	</div>
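<div class="doc_text">
<p>To make the sub-register and super-register relationship concrete, the
following standalone C++ sketch encodes only the two double-precision
registers shown above and answers the reverse question: which double register
contains a given single-precision register? The <tt>SketchReg</tt>,
<tt>DoubleRegDesc</tt>, and <tt>superRegOf</tt> names are hypothetical; in
LLVM this information comes from the TableGen-generated tables.</p>
</div>
<div class="doc_code">
<pre>
#include <cassert>
#include <cstddef>

// Hypothetical register identifiers for the registers named above.
enum SketchReg { F0, F1, F2, F3, D0, D1, NoSuperReg };

// Mirrors the Rd arguments: the 5-bit number and the two sub-registers.
struct DoubleRegDesc {
  SketchReg Reg;
  unsigned Num;
  SketchReg SubRegs[2];
};

// Only the two definitions shown in the SPARC excerpt above.
const DoubleRegDesc DoubleRegs[] = {
  { D0, 0, { F0, F1 } },
  { D1, 2, { F2, F3 } },
};

// Returns the double register that lists FReg as a sub-register, or
// NoSuperReg if none of the known double registers contains it.
SketchReg superRegOf(SketchReg FReg) {
  const std::size_t Count = sizeof(DoubleRegs) / sizeof(DoubleRegs[0]);
  for (std::size_t i = 0; i < Count; ++i)
    for (int j = 0; j < 2; ++j)
      if (DoubleRegs[i].SubRegs[j] == FReg)
        return DoubleRegs[i].Reg;
  return NoSuperReg;
}
</pre>
</div>
<div class="doc_text">
<p>In this sketch, <tt>superRegOf(F1)</tt> yields <tt>D0</tt> and
<tt>superRegOf(F3)</tt> yields <tt>D1</tt>.</p>
</div>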
				566
				567	<!-- ======================================================================= -->
				568	<div class="doc_subsection">
				569	<a name="RegisterClassDef">Defining a Register Class</a>
				570	</div>
				571	<div class="doc_text">
				572	<p>The RegisterClass class (specified in <tt>Target.td</tt>) is used to
				573	define an object that represents a group of related registers and also defines
				574	the default allocation order of the registers. A target description file
				575	<tt>XXXRegisterInfo.td</tt> that uses <tt>Target.td</tt> can construct register classes using the
				576	following class:</p>
				577	</div>
				578
				579	<div class="doc_code">
				580	<pre>
				581	class RegisterClass<string namespace,
				582	                    list<ValueType> regTypes, int alignment,
				583	                    list<Register> regList> {
				584	  string Namespace = namespace;
				585	  list<ValueType> RegTypes = regTypes;
				586	  int Size = 0; // spill size, in bits; zero lets tblgen pick the size
				587	  int Alignment = alignment;
				588
				589	  // CopyCost is the cost of copying a value between two registers
				590	  // default value 1 means a single instruction
				591	  // A negative value means copying is extremely expensive or impossible
				592	  int CopyCost = 1;
				593	  list<Register> MemberList = regList;
				594
				595	  // for register classes that are subregisters of this class
				596	  list<RegisterClass> SubRegClassList = [];
				597
				598	  code MethodProtos = [{}]; // to insert arbitrary code
				599	  code MethodBodies = [{}];
				600	}</pre>
				601	</div>
				602	<div class="doc_text">
				603	<p>To define a RegisterClass, use the following 4 arguments:</p>
				604	<ul>
				605	<li>The first argument of the definition is the name of the
				606	namespace. </li>
				607
				608	<li>The second argument is a list of ValueType register type values
				609	that are defined in <tt>include/llvm/CodeGen/ValueTypes.td</tt>. Defined values include
				610	integer types (such as i16, i32, and i1 for Boolean), floating-point types
				611	(f32, f64), and vector types (for example, v8i16 for an 8 x i16 vector). All
				612	registers in a RegisterClass must have the same ValueType, but some registers
				613	may store vector data in different configurations. For example a register that
				614	can process a 128-bit vector may be able to handle 16 8-bit integer elements, 8
				615	16-bit integers, 4 32-bit integers, and so on. </li>
				616
				617	<li>The third argument of the RegisterClass definition specifies the
				618	alignment required of the registers when they are stored or loaded to memory.</li>
				619
				620	<li>The final argument, <tt>regList</tt>, specifies which registers are in
				621	this class. If an <tt>allocation_order_*</tt> method is not specified, then <tt>regList</tt> also
				622	defines the order of allocation used by the register allocator.</li>
				623	</ul>
				624
<p>In <tt>SparcRegisterInfo.td</tt>, three RegisterClass objects are defined:
FPRegs, DFPRegs, and IntRegs. For all three register classes, the first
argument defines the namespace with the string “SP”. FPRegs defines a group of 32
single-precision floating-point registers (F0 to F31); DFPRegs defines a group
of 16 double-precision registers (D0 to D15). For IntRegs, the MethodProtos and
MethodBodies fields supply code that TableGen inserts into the generated
output.</p>
				632	</div>
				633	<div class="doc_code">
				634	<pre>
def FPRegs : RegisterClass<"SP", [f32], 32, [F0, F1, F2, F3, F4, F5, F6, F7,
  F8, F9, F10, F11, F12, F13, F14, F15, F16, F17, F18, F19, F20, F21, F22,
  F23, F24, F25, F26, F27, F28, F29, F30, F31]>;

def DFPRegs : RegisterClass<"SP", [f64], 64, [D0, D1, D2, D3, D4, D5, D6, D7,
  D8, D9, D10, D11, D12, D13, D14, D15]>;

def IntRegs : RegisterClass<"SP", [i32], 32, [L0, L1, L2, L3, L4, L5, L6, L7,
                                     I0, I1, I2, I3, I4, I5,
                                     O0, O1, O2, O3, O4, O5, O7,
                                     G1,
                                     // Non-allocatable regs:
                                     G2, G3, G4,
                                     O6,        // stack ptr
                                     I6,        // frame ptr
                                     I7,        // return address
                                     G0,        // constant zero
                                     G5, G6, G7 // reserved for kernel
                                     ]> {
  let MethodProtos = [{
    iterator allocation_order_end(const MachineFunction &MF) const;
  }];
  let MethodBodies = [{
    IntRegsClass::iterator
    IntRegsClass::allocation_order_end(const MachineFunction &MF) const {
      return end()-10  // Don't allocate special registers
             -1;
    }
  }];
}
				665	</pre>
				666	</div>
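<div class="doc_text">
<p>The arithmetic in <tt>allocation_order_end</tt> depends on the order of the member
list: the registers that must not be allocated are placed last so that the end
iterator can simply be moved back past them. The following standalone C++
sketch reproduces that arithmetic on a plain vector of names. The helper names
<tt>allocationOrderEnd</tt> and <tt>numAllocatable</tt> are illustrative assumptions, not
TableGen output.</p>
</div>

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Member list of IntRegs, in the order given in SparcRegisterInfo.td.
const std::vector<std::string> IntRegs = {
    "L0", "L1", "L2", "L3", "L4", "L5", "L6", "L7",
    "I0", "I1", "I2", "I3", "I4", "I5",
    "O0", "O1", "O2", "O3", "O4", "O5", "O7",
    "G1",
    // Non-allocatable registers:
    "G2", "G3", "G4", "O6", "I6", "I7", "G0", "G5", "G6", "G7"};

// Mirrors "end()-10 -1": step back over the ten trailing non-allocatable
// registers, then over one more entry (G1, by its position in the list).
std::vector<std::string>::const_iterator allocationOrderEnd() {
  return IntRegs.end() - 10 - 1;
}

// Number of registers available to the allocator under this model.
std::size_t numAllocatable() {
  return static_cast<std::size_t>(allocationOrderEnd() - IntRegs.begin());
}
```

<div class="doc_text">
<p>With the 32 members listed, this leaves 21 allocatable registers (L0 to L7,
I0 to I5, and O0 to O5 plus O7). Reordering the member list without adjusting
the constants would silently change which registers are excluded.</p>
</div>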
				667
				668	<div class="doc_text">
<p>Using <tt>SparcRegisterInfo.td</tt> with TableGen generates several output
files that are intended for inclusion in other source code that you write.
<tt>SparcRegisterInfo.td</tt> generates <tt>SparcGenRegisterInfo.h.inc</tt>, which should be
included in the header file for the SPARC register implementation that you
write (<tt>SparcRegisterInfo.h</tt>). In <tt>SparcGenRegisterInfo.h.inc</tt>, a new structure called
SparcGenRegisterInfo is defined that uses TargetRegisterInfo as its base. It also
specifies types based upon the defined register classes: DFPRegsClass, FPRegsClass,
and IntRegsClass. </p>
				678
<p><tt>SparcRegisterInfo.td</tt> also generates <tt>SparcGenRegisterInfo.inc</tt>,
				680	which is included at the bottom of <tt>SparcRegisterInfo.cpp</tt>, the SPARC register
				681	implementation. The code below shows only the generated integer registers and
				682	associated register classes. The order of registers in IntRegs reflects the
				683	order in the definition of IntRegs in the target description file. Take special
				684	note of the use of MethodBodies in <tt>SparcRegisterInfo.td</tt> to create code in
				685	<tt>SparcGenRegisterInfo.inc</tt>. MethodProtos generates similar code in
				686	<tt>SparcGenRegisterInfo.h.inc</tt>.</p>
				687	</div>
				688
				689	<div class="doc_code">
				690	<pre> // IntRegs Register Class...
				691	static const unsigned IntRegs[] = {
				692	SP::L0, SP::L1, SP::L2, SP::L3, SP::L4, SP::L5,
				693	SP::L6, SP::L7, SP::I0, SP::I1, SP::I2, SP::I3, SP::I4, SP::I5, SP::O0, SP::O1,
				694	SP::O2, SP::O3, SP::O4, SP::O5, SP::O7, SP::G1, SP::G2, SP::G3, SP::G4, SP::O6,
				695	SP::I6, SP::I7, SP::G0, SP::G5, SP::G6, SP::G7,
				696	};
				697
				698	// IntRegsVTs Register Class Value Types...
				699	static const MVT::ValueType IntRegsVTs[] = {
				700	MVT::i32, MVT::Other
				701	};
				702	namespace SP { // Register class instances
				703	DFPRegsClass    DFPRegsRegClass;
				704	FPRegsClass     FPRegsRegClass;
				705	IntRegsClass    IntRegsRegClass;
				706	...
				707
				708	// IntRegs Sub-register Classess...
				709	static const TargetRegisterClass* const IntRegsSubRegClasses [] = {
				710	NULL
				711	};
				712	...
				713	// IntRegs Super-register Classess...
				714	static const TargetRegisterClass* const IntRegsSuperRegClasses [] = {
				715	NULL
				716	};
				717
				718	// IntRegs Register Class sub-classes...
				719	static const TargetRegisterClass* const IntRegsSubclasses [] = {
				720	NULL
				721	};
				722	...
				723
				724	// IntRegs Register Class super-classes...
				725	static const TargetRegisterClass* const IntRegsSuperclasses [] = {
				726	NULL
				727	};
				728	...
				729
				730	IntRegsClass::iterator
				731	IntRegsClass::allocation_order_end(const MachineFunction &MF) const {
				732
				733	return end()-10 // Don't allocate special registers
				734	-1;
				735	}
				736
				737	IntRegsClass::IntRegsClass() : TargetRegisterClass(IntRegsRegClassID,
				738	IntRegsVTs, IntRegsSubclasses, IntRegsSuperclasses, IntRegsSubRegClasses,
				739	IntRegsSuperRegClasses, 4, 4, 1, IntRegs, IntRegs + 32) {}
				740	}
				741	</pre>
				742	</div>
				743	<!-- ======================================================================= -->
				744	<div class="doc_subsection">
				745	<a name="implementRegister">Implement a subclass of
				746	<a href="http://www.llvm.org/docs/CodeGenerator.html#targetregisterinfo">TargetRegisterInfo</a></a>
				747	</div>
				748	<div class="doc_text">
				749	<p>The final step is to hand code portions of XXXRegisterInfo, which
				750	implements the interface described in <tt>TargetRegisterInfo.h</tt>. These functions
				751	return 0, NULL, or false, unless overridden. Here’s a list of functions that
				752	are overridden for the SPARC implementation in <tt>SparcRegisterInfo.cpp</tt>:</p>
				753	<ul>
				754	<li><tt>getCalleeSavedRegs</tt> (returns a list of callee-saved registers in
				755	the order of the desired callee-save stack frame offset)</li>
				756
				757	<li><tt>getCalleeSavedRegClasses</tt> (returns a list of preferred register
				758	classes with which to spill each callee saved register)</li>
				759
				760	<li><tt>getReservedRegs</tt> (returns a bitset indexed by physical register
				761	numbers, indicating if a particular register is unavailable)</li>
				762
<li><tt>hasFP</tt> (returns a Boolean indicating if a function should have a
dedicated frame pointer register)</li>
				765
				766	<li><tt>eliminateCallFramePseudoInstr</tt> (if call frame setup or destroy
				767	pseudo instructions are used, this can be called to eliminate them)</li>
				768
				769	<li><tt>eliminateFrameIndex</tt> (eliminate abstract frame indices from
				770	instructions that may use them)</li>
				771
				772	<li><tt>emitPrologue</tt> (insert prologue code into the function)</li>
				773
				774	<li><tt>emitEpilogue</tt> (insert epilogue code into the function)</li>
				775	</ul>
				776	</div>
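<div class="doc_text">
<p>As an illustration of the bitset returned by <tt>getReservedRegs</tt>, the sketch
below uses <tt>std::bitset</tt> and a made-up register numbering. It is not the LLVM
interface: the real method takes a MachineFunction and uses LLVM's own
bit-vector type, and the <tt>RegNum</tt> enum and <tt>isReserved</tt> helper here are
assumptions for the example.</p>
</div>

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical register numbering; real numbers come from TableGen output.
enum RegNum : std::size_t { G0, G1, O6, I6, I7, L0, NumRegs };

// Build a bitset with one bit per physical register. A set bit marks the
// register as unavailable to the register allocator.
std::bitset<NumRegs> getReservedRegs() {
  std::bitset<NumRegs> Reserved;
  Reserved.set(G0); // constant zero
  Reserved.set(O6); // stack pointer
  Reserved.set(I6); // frame pointer
  Reserved.set(I7); // return address
  return Reserved;
}

// Convenience query of the kind an allocator might make.
bool isReserved(std::size_t Reg) {
  assert(Reg < NumRegs && "register number out of range");
  return getReservedRegs().test(Reg);
}
```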
				777
				778	<!-- *********************************************************************** -->
				779	<div class="doc_section">
				780	<a name="InstructionSet">Instruction Set</a>
				781	</div>
				782	<!-- *********************************************************************** -->
				783	<div class="doc_text">
				784	<p>During the early stages of code generation, the LLVM IR code is
				785	converted to a SelectionDAG with nodes that are instances of the SDNode class
				786	containing target instructions. An SDNode has an opcode, operands, type
				787	requirements, and operation properties (for example, is an operation
				788	commutative, does an operation load from memory). The various operation node
				789	types are described in the <tt>include/llvm/CodeGen/SelectionDAGNodes.h</tt> file (values
				790	of the NodeType enum in the ISD namespace).</p>
				791
				792	<p>TableGen uses the following target description (.td) input files
				793	to generate much of the code for instruction definition:</p>
				794	<ul>
				795	<li><tt>Target.td</tt>, where the Instruction, Operand, InstrInfo, and other
				796	fundamental classes are defined</li>
				797
				798	<li><tt>TargetSelectionDAG.td</tt>, used by SelectionDAG instruction selection
				799	generators, contains SDTC* classes (selection DAG type constraint), definitions
				800	of SelectionDAG nodes (such as imm, cond, bb, add, fadd, sub), and pattern
				801	support (Pattern, Pat, PatFrag, PatLeaf, ComplexPattern)</li>
				802
				803	<li><tt>XXXInstrFormats.td</tt>, patterns for definitions of target-specific
				804	instructions</li>
				805
<li><tt>XXXInstrInfo.td</tt>, target-specific definitions of instruction
templates, condition codes, and instructions of an instruction set. (For architecture
modifications, a different file name may be used. For example, for Pentium with
SSE instructions, this file is <tt>X86InstrSSE.td</tt>, and for Pentium with MMX, this
file is <tt>X86InstrMMX.td</tt>.)</li>
				811	</ul>
				812	<p>There is also a target-specific <tt>XXX.td</tt> file, where XXX is the
				813	name of the target. The <tt>XXX.td</tt> file includes the other .td input files, but its
				814	contents are only directly important for subtargets.</p>
				815
<p>You should describe a concrete target-specific class XXXInstrInfo that
represents machine instructions supported by a target machine. XXXInstrInfo
contains an array of XXXInstrDescriptor objects, each of which describes one
instruction. An instruction descriptor defines:</p>
				822	<ul>
				823	<li>opcode mnemonic</li>
				824
				825	<li>number of operands</li>
				826
				827	<li>list of implicit register definitions and uses</li>
				828
				829	<li>target-independent properties (such as memory access, is
				830	commutable)</li>
				831
				832	<li>target-specific flags </li>
				833	</ul>
				834
				835	<p>The Instruction class (defined in <tt>Target.td</tt>) is mostly used as a
				836	base for more complex instruction classes.</p>
				837	</div>
				838
				839	<div class="doc_code">
<pre>class Instruction {
  string Namespace = "";
  dag OutOperandList;    // A dag containing the MI def operand list.
  dag InOperandList;     // A dag containing the MI use operand list.
  string AsmString = ""; // The .s format to print the instruction with.
  list<dag> Pattern;     // Set to the DAG pattern for this instruction
  list<Register> Uses = [];
  list<Register> Defs = [];
  list<Predicate> Predicates = [];  // predicates turned into isel match code
  ... remainder not shown for space ...
}
				851	</pre>
				852	</div>
				853	<div class="doc_text">
				854	<p>A SelectionDAG node (SDNode) should contain an object
				855	representing a target-specific instruction that is defined in <tt>XXXInstrInfo.td</tt>. The
				856	instruction objects should represent instructions from the architecture manual
				857	of the target machine (such as the
				858	SPARC Architecture Manual for the SPARC target). </p>
				859
				860	<p>A single
				861	instruction from the architecture manual is often modeled as multiple target
				862	instructions, depending upon its operands.  For example, a manual might
				863	describe an add instruction that takes a register or an immediate operand. An
				864	LLVM target could model this with two instructions named ADDri and ADDrr.</p>
				865
				866	<p>You should define a
				867	class for each instruction category and define each opcode as a subclass of the
				868	category with appropriate parameters such as the fixed binary encoding of
				869	opcodes and extended opcodes. You should map the register bits to the bits of
				870	the instruction in which they are encoded (for the JIT). Also you should specify
				871	how the instruction should be printed when the automatic assembly printer is
				872	used.</p>
				873
				874	<p>As is described in
				875	the SPARC Architecture Manual, Version 8, there are three major 32-bit formats
				876	for instructions. Format 1 is only for the CALL instruction. Format 2 is for
				877	branch on condition codes and SETHI (set high bits of a register) instructions.
				878	Format 3 is for other instructions. </p>
				879
<p>Each of these formats has corresponding classes in
<tt>SparcInstrFormats.td</tt>. InstSP is a base
class for other instruction classes. Additional base classes are specified for
more precise formats: for example, in <tt>SparcInstrFormats.td</tt>, F2_1 is for SETHI,
and F2_2 is for branches. There are three other base classes: F3_1 for
register/register operations, F3_2 for register/immediate operations, and F3_3 for
floating-point operations. <tt>SparcInstrInfo.td</tt> also adds the base class Pseudo for
synthetic SPARC instructions. </p>
				888
<p><tt>SparcInstrInfo.td</tt>
largely consists of operand and instruction definitions for the SPARC target. In
<tt>SparcInstrInfo.td</tt>, the following target description file entry, LDrr, defines
the Load Integer instruction for a Word (the LD SPARC opcode) from a memory
address to a register. The first parameter, the value 3 (11<sub>2</sub>), is
the operation value for this category of operation. The second parameter
(000000<sub>2</sub>) is the specific operation value for LD/Load Word. The
third parameter is the output destination, a register operand whose register
class (IntRegs) is defined in the register target description file. </p>
				898	</div>
				899	<div class="doc_code">
				900	<pre>def LDrr : F3_1 <3, 0b000000, (outs IntRegs:$dst), (ins MEMrr:$addr),
				901	"ld [$addr], $dst",
				902	[(set IntRegs:$dst, (load ADDRrr:$addr))]>;
				903	</pre>
				904	</div>
				905
				906	<div class="doc_text">
				907	<p>The fourth
				908	parameter is the input source, which uses the address operand MEMrr that is
				909	defined earlier in <tt>SparcInstrInfo.td</tt>:</p>
				910	</div>
				911	<div class="doc_code">
				912	<pre>def MEMrr : Operand<i32> {
				913	let PrintMethod = "printMemOperand";
				914	let MIOperandInfo = (ops IntRegs, IntRegs);
				915	}
				916	</pre>
				917	</div>
				918	<div class="doc_text">
<p>The fifth parameter is a string that is used by the assembly
printer and can be left as an empty string until the assembly printer interface
is implemented. The sixth and final parameter is the pattern used to match the
instruction during the SelectionDAG Select Phase described in
<a href="http://www.llvm.org/docs/CodeGenerator.html">The LLVM Target-Independent Code Generator</a>.
This parameter is detailed in the next section, <a href="#InstructionSelector">Instruction Selector</a>.</p>
				925
<p>Instruction class definitions are not overloaded for different
operand types, so separate versions of instructions are needed for register,
memory, or immediate value operands. For example, to perform a
Load Integer instruction for a Word
from a register-plus-immediate address to a register, the following instruction class is
defined: </p>
				932	</div>
				933	<div class="doc_code">
				934	<pre>def LDri : F3_2 <3, 0b000000, (outs IntRegs:$dst), (ins MEMri:$addr),
				935	"ld [$addr], $dst",
				936	[(set IntRegs:$dst, (load ADDRri:$addr))]>;
				937	</pre>
				938	</div>
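<div class="doc_text">
<p>To make the first two parameters of LDrr and LDri concrete, the following
C++ sketch packs the fields of a SPARC Version 8 format 3 instruction into a
32-bit word. The field positions follow the format 3 layout in the SPARC
Architecture Manual; the function names <tt>encodeF3_1</tt> and <tt>encodeF3_2</tt> are
hypothetical and only echo the TableGen class names. This is not the code that
TableGen generates.</p>
</div>

```cpp
#include <cassert>
#include <cstdint>

// Format 3, register/register form (i = 0):
//   op[31:30] rd[29:25] op3[24:19] rs1[18:14] i[13] (unused)[12:5] rs2[4:0]
std::uint32_t encodeF3_1(unsigned Op, unsigned Op3, unsigned Rd,
                         unsigned Rs1, unsigned Rs2) {
  assert(Op < 4 && Op3 < 64 && Rd < 32 && Rs1 < 32 && Rs2 < 32);
  return (Op << 30) | (Rd << 25) | (Op3 << 19) | (Rs1 << 14) | Rs2;
}

// Format 3, register/immediate form (i = 1) with a 13-bit signed immediate:
//   op[31:30] rd[29:25] op3[24:19] rs1[18:14] i[13] simm13[12:0]
std::uint32_t encodeF3_2(unsigned Op, unsigned Op3, unsigned Rd,
                         unsigned Rs1, int Simm13) {
  assert(Op < 4 && Op3 < 64 && Rd < 32 && Rs1 < 32);
  assert(Simm13 >= -4096 && Simm13 <= 4095);
  return (Op << 30) | (Rd << 25) | (Op3 << 19) | (Rs1 << 14) | (1u << 13) |
         (static_cast<std::uint32_t>(Simm13) & 0x1FFFu);
}
```

<div class="doc_text">
<p>With global registers %g1, %g2, and %g3 numbered 1, 2, and 3, the call
<tt>encodeF3_1(3, 0, 3, 1, 2)</tt> corresponds to <tt>ld [%g1+%g2], %g3</tt>, and
<tt>encodeF3_2(3, 0, 3, 1, 8)</tt> corresponds to <tt>ld [%g1+8], %g3</tt>. The two forms
differ only in bit 13 and in how the low 13 bits are interpreted, which is why
the target models them as two separate instructions.</p>
</div>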
				939	<div class="doc_text">
				940	<p>Writing these definitions for so many similar instructions can
				941	involve a lot of cut and paste. In td files, the <tt>multiclass</tt> directive enables
				942	the creation of templates to define several instruction classes at once (using
				943	the <tt>defm</tt> directive). For example in
				944	<tt>SparcInstrInfo.td</tt>, the <tt>multiclass</tt> pattern F3_12 is defined to create 2
				945	instruction classes each time F3_12 is invoked: </p>
				946	</div>
				947	<div class="doc_code">
				948	<pre>multiclass F3_12 <string OpcStr, bits<6> Op3Val, SDNode OpNode> {
				949	def rr : F3_1 <2, Op3Val,
				950	(outs IntRegs:$dst), (ins IntRegs:$b, IntRegs:$c),
				951	!strconcat(OpcStr, " $b, $c, $dst"),
				952	[(set IntRegs:$dst, (OpNode IntRegs:$b, IntRegs:$c))]>;
				953	def ri : F3_2 <2, Op3Val,
				954	(outs IntRegs:$dst), (ins IntRegs:$b, i32imm:$c),
				955	!strconcat(OpcStr, " $b, $c, $dst"),
				956	[(set IntRegs:$dst, (OpNode IntRegs:$b, simm13:$c))]>;
				957	}
				958	</pre>
				959	</div>
				960	<div class="doc_text">
				961	<p>So when the <tt>defm</tt> directive is used for the XOR and ADD
				962	instructions, as seen below, it creates four instruction objects: XORrr, XORri,
				963	ADDrr, and ADDri.</p>
				964	</div>
				965	<div class="doc_code">
				966	<pre>defm XOR : F3_12<"xor", 0b000011, xor>;
				967	defm ADD : F3_12<"add", 0b000000, add>;
				968	</pre>
				969	</div>
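<div class="doc_text">
<p>The effect of <tt>defm</tt> can be imitated in ordinary C++ to show how the names
are formed. The sketch below is an analogy only: <tt>InstrDef</tt> and <tt>expandF3_12</tt>
are hypothetical names, and TableGen records carry far more information than
the four fields kept here.</p>
</div>

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical record for one generated instruction definition.
struct InstrDef {
  std::string Name;      // for example "ADDrr"
  std::string AsmString; // for example "add $b, $c, $dst"
  unsigned Op3;          // the value passed as Op3Val
  bool HasImmediate;     // false for the rr form, true for the ri form
};

// Imitates "defm Prefix : F3_12<OpcStr, Op3Val, ...>": one call yields the
// rr and ri variants, named by appending the inner def names to the prefix.
std::vector<InstrDef> expandF3_12(const std::string &Prefix,
                                  const std::string &OpcStr,
                                  unsigned Op3Val) {
  const std::string Asm = OpcStr + " $b, $c, $dst"; // like !strconcat
  return {{Prefix + "rr", Asm, Op3Val, false},
          {Prefix + "ri", Asm, Op3Val, true}};
}
```

<div class="doc_text">
<p>Calling <tt>expandF3_12("XOR", "xor", 0x3)</tt> and <tt>expandF3_12("ADD", "add", 0x0)</tt>
produces the same four names as the two <tt>defm</tt> lines: XORrr, XORri, ADDrr, and
ADDri.</p>
</div>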
				970
				971	<div class="doc_text">
<p><tt>SparcInstrInfo.td</tt>
also includes definitions for condition codes that are referenced by branch
instructions. The following definitions in <tt>SparcInstrInfo.td</tt> give the numeric value
assigned to each SPARC condition code; for example, the value 10 represents
the ‘greater than’ condition for integers, and the value 22
represents the ‘greater than’ condition for floats. </p>
				978	</div>
				979
				980	<div class="doc_code">
				981	<pre>def ICC_NE : ICC_VAL< 9>; // Not Equal
				982	def ICC_E : ICC_VAL< 1>; // Equal
				983	def ICC_G : ICC_VAL<10>; // Greater
				984	...
				985	def FCC_U : FCC_VAL<23>; // Unordered
				986	def FCC_G : FCC_VAL<22>; // Greater
				987	def FCC_UG : FCC_VAL<21>; // Unordered or Greater
				988	...
				989	</pre>
				990	</div>
				991
				992	<div class="doc_text">
				993	<p>(Note that <tt>Sparc.h</tt>
				994	also defines enums that correspond to the same SPARC condition codes. Care must
				995	be taken to ensure the values in <tt>Sparc.h</tt> correspond to the values in
				996	<tt>SparcInstrInfo.td</tt>; that is, <tt>SPCC::ICC_NE = 9</tt>, <tt>SPCC::FCC_U = 23</tt> and so on.)</p>
				997	</div>
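<div class="doc_text">
<p>One way to keep the two sets of values from drifting apart is to pin the
enum values with compile-time checks. The sketch below is an assumption about
how such a header could look, limited to the six codes shown above; the
<tt>condCodeName</tt> helper is hypothetical and is not claimed to match the contents
of <tt>Sparc.h</tt>.</p>
</div>

```cpp
#include <cassert>
#include <string>

namespace SPCC {
// Values must agree with the ICC_VAL and FCC_VAL definitions in the .td file.
enum CondCodes {
  ICC_NE = 9,  // Not Equal
  ICC_E = 1,   // Equal
  ICC_G = 10,  // Greater
  FCC_U = 23,  // Unordered
  FCC_G = 22,  // Greater
  FCC_UG = 21  // Unordered or Greater
};
} // namespace SPCC

// Compile-time guards: editing one enum value without revisiting the .td
// file now breaks the build instead of producing wrong branches.
static_assert(SPCC::ICC_NE == 9, "must match ICC_VAL<9> in the .td file");
static_assert(SPCC::ICC_G == 10, "must match ICC_VAL<10> in the .td file");
static_assert(SPCC::FCC_U == 23, "must match FCC_VAL<23> in the .td file");

// Hypothetical helper that maps a condition code to a short name.
std::string condCodeName(SPCC::CondCodes CC) {
  switch (CC) {
  case SPCC::ICC_NE: return "ne";
  case SPCC::ICC_E:  return "e";
  case SPCC::ICC_G:  return "g";
  case SPCC::FCC_U:  return "u";
  case SPCC::FCC_G:  return "g";
  case SPCC::FCC_UG: return "ug";
  }
  return "unknown";
}
```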
				998
				999	<!-- ======================================================================= -->
				1000	<div class="doc_subsection">
				1001	<a name="implementInstr">Implement a subclass of
				1002	<a href="http://www.llvm.org/docs/CodeGenerator.html#targetinstrinfo">TargetInstrInfo</a></a>
				1003	</div>
				1004
				1005	<div class="doc_text">
				1006	<p>The final step is to hand code portions of XXXInstrInfo, which
				1007	implements the interface described in <tt>TargetInstrInfo.h</tt>. These functions return
				1008	0 or a Boolean or they assert, unless overridden. Here's a list of functions
				1009	that are overridden for the SPARC implementation in <tt>SparcInstrInfo.cpp</tt>:</p>
				1010	<ul>
				1011	<li><tt>isMoveInstr</tt> (return true if the instruction is a register to
				1012	register move; false, otherwise)</li>
				1013
				1014	<li><tt>isLoadFromStackSlot</tt> (if the specified machine instruction is a
				1015	direct load from a stack slot, return the register number of the destination
				1016	and the FrameIndex of the stack slot)</li>
				1017
<li><tt>isStoreToStackSlot</tt> (if the specified machine instruction is a
direct store to a stack slot, return the register number of the source and
the FrameIndex of the stack slot)</li>
				1021
				1022	<li><tt>copyRegToReg</tt> (copy values between a pair of registers)</li>
				1023
				1024	<li><tt>storeRegToStackSlot</tt> (store a register value to a stack slot)</li>
				1025
				1026	<li><tt>loadRegFromStackSlot</tt> (load a register value from a stack slot)</li>
				1027
				1028	<li><tt>storeRegToAddr</tt> (store a register value to memory)</li>
				1029
				1030	<li><tt>loadRegFromAddr</tt> (load a register value from memory)</li>
				1031
<li><tt>foldMemoryOperand</tt> (attempt to fold a load or store into the
specified instruction for the specified operand(s))</li>
				1034	</ul>
				1035	</div>
				1036
				1037	<!-- ======================================================================= -->
				1038	<div class="doc_subsection">
				1039	<a name="branchFolding">Branch Folding and If Conversion</a>
				1040	</div>
				1041	<div class="doc_text">
				1042	<p>Performance can be improved by combining instructions or by eliminating
				1043	instructions that are never reached. The <tt>AnalyzeBranch</tt> method in XXXInstrInfo may
				1044	be implemented to examine conditional instructions and remove unnecessary
				1045	instructions. <tt>AnalyzeBranch</tt> looks at the end of a machine basic block (MBB) for
				1046	opportunities for improvement, such as branch folding and if conversion. The
				1047	<tt>BranchFolder</tt> and <tt>IfConverter</tt> machine function passes (see the source files
				1048	<tt>BranchFolding.cpp</tt> and <tt>IfConversion.cpp</tt> in the <tt>lib/CodeGen</tt> directory) call
				1049	<tt>AnalyzeBranch</tt> to improve the control flow graph that represents the
				1050	instructions. </p>
				1051
				1052	<p>Several implementations of <tt>AnalyzeBranch</tt> (for ARM, Alpha, and
				1053	X86) can be examined as models for your own <tt>AnalyzeBranch</tt> implementation. Since
				1054	SPARC does not implement a useful <tt>AnalyzeBranch</tt>, the ARM target implementation
				1055	is shown below.</p>
				1056
				1057	<p><tt>AnalyzeBranch</tt> returns a Boolean value and takes four parameters:</p>
				1058	<ul>
				1059	<li>MachineBasicBlock &MBB – the incoming block to be
				1060	examined</li>
				1061
				1062	<li>MachineBasicBlock *&TBB – a destination block that is
				1063	returned; for a conditional branch that evaluates to true, TBB is the
				1064	destination </li>
				1065
				1066	<li>MachineBasicBlock *&FBB – for a conditional branch that
				1067	evaluates to false, FBB is returned as the destination</li>
				1068
				1069	<li>std::vector<MachineOperand> &Cond – list of
				1070	operands to evaluate a condition for a conditional branch</li>
				1071	</ul>
				1072
<p>In the simplest case, if a block ends without a branch, then it
falls through to the successor block. No destination blocks are specified for
either TBB or FBB, so both parameters are left as NULL. The start of <tt>AnalyzeBranch</tt>
(see code below for the ARM target) shows the function parameters and the code
for the simplest case.</p>
				1078	</div>
				1079
				1080	<div class="doc_code">
<pre>bool ARMInstrInfo::AnalyzeBranch(MachineBasicBlock &MBB,
                                 MachineBasicBlock *&TBB,
                                 MachineBasicBlock *&FBB,
                                 std::vector<MachineOperand> &Cond) const
{
  MachineBasicBlock::iterator I = MBB.end();
  if (I == MBB.begin() || !isUnpredicatedTerminator(--I))
    return false;
				1088	</pre>
				1089	</div>
				1090
				1091	<div class="doc_text">
				1092	<p>If a block ends with a single unconditional branch instruction,
				1093	then <tt>AnalyzeBranch</tt> (shown below) should return the destination of that branch
				1094	in the TBB parameter. </p>
				1095	</div>
				1096
				1097	<div class="doc_code">
<pre>if (LastOpc == ARM::B || LastOpc == ARM::tB) {
  TBB = LastInst->getOperand(0).getMBB();
  return false;
}
				1102	</pre>
				1103	</div>
				1104
				1105	<div class="doc_text">
				1106	<p>If a block ends with two unconditional branches, then the second
				1107	branch is never reached. In that situation, as shown below, remove the last
				1108	branch instruction and return the penultimate branch in the TBB parameter. </p>
				1109	</div>
				1110
				1111	<div class="doc_code">
<pre>if ((SecondLastOpc == ARM::B || SecondLastOpc == ARM::tB) &&
    (LastOpc == ARM::B || LastOpc == ARM::tB)) {
  TBB = SecondLastInst->getOperand(0).getMBB();
  I = LastInst;
  I->eraseFromParent();
  return false;
}
				1119	</pre>
				1120	</div>
				1121	<div class="doc_text">
<p>A block may end with a single conditional branch instruction that
falls through to the successor block if the condition evaluates to false. In that
				1124	case, <tt>AnalyzeBranch</tt> (shown below) should return the destination of that
				1125	conditional branch in the TBB parameter and a list of operands in the <tt>Cond</tt>
				1126	parameter to evaluate the condition. </p>
				1127	</div>
				1128
				1129	<div class="doc_code">
<pre>if (LastOpc == ARM::Bcc || LastOpc == ARM::tBcc) {
  // Block ends with fall-through condbranch.
  TBB = LastInst->getOperand(0).getMBB();
  Cond.push_back(LastInst->getOperand(1));
  Cond.push_back(LastInst->getOperand(2));
  return false;
}
				1137	</pre>
				1138	</div>
				1139
				1140	<div class="doc_text">
				1141	<p>If a block ends with both a conditional branch and an ensuing
				1142	unconditional branch, then <tt>AnalyzeBranch</tt> (shown below) should return the
				1143	conditional branch destination (assuming it corresponds to a conditional
				1144	evaluation of ‘true’) in the TBB parameter and the unconditional branch
				1145	destination in the FBB (corresponding to a conditional evaluation of ‘false’).
				1146	A list of operands to evaluate the condition should be returned in the <tt>Cond</tt>
				1147	parameter.</p>
				1148	</div>
				1149
				1150	<div class="doc_code">
<pre>unsigned SecondLastOpc = SecondLastInst->getOpcode();
if ((SecondLastOpc == ARM::Bcc && LastOpc == ARM::B) ||
    (SecondLastOpc == ARM::tBcc && LastOpc == ARM::tB)) {
  TBB = SecondLastInst->getOperand(0).getMBB();
  Cond.push_back(SecondLastInst->getOperand(1));
  Cond.push_back(SecondLastInst->getOperand(2));
  FBB = LastInst->getOperand(0).getMBB();
  return false;
}
				1160	</pre>
				1161	</div>
				1162
				1163	<div class="doc_text">
				1164	<p>For the last two cases (ending with a single conditional branch or
				1165	ending with one conditional and one unconditional branch), the operands returned
				1166	in the <tt>Cond</tt> parameter can be passed to methods of other instructions to create
				1167	new branches or perform other operations. An implementation of <tt>AnalyzeBranch</tt>
				1168	requires the helper methods <tt>RemoveBranch</tt> and <tt>InsertBranch</tt> to manage subsequent
				1169	operations.</p>
				1170
				1171	<p><tt>AnalyzeBranch</tt> should return false indicating success in most circumstances.
				1172	<tt>AnalyzeBranch</tt> should only return true when the method is stumped about what to
				1173	do, for example, if a block has three terminating branches. <tt>AnalyzeBranch</tt> may
				1174	return true if it encounters a terminator it cannot handle, such as an indirect
				1175	branch.</p>
				1176	</div>
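<div class="doc_text">
<p>The cases described in this section can be gathered into one self-contained
C++ sketch. It is a simplified model and not the ARM implementation: <tt>Block</tt>,
<tt>Inst</tt>, and the three-value <tt>Opcode</tt> enum are hypothetical stand-ins for the
LLVM classes, the condition is reduced to a single integer, and the function
resets its output parameters on entry, which the real interface does not
promise.</p>
</div>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified stand-ins for LLVM's machine-level classes (hypothetical).
enum Opcode { Other, B, Bcc }; // non-branch, unconditional, conditional
struct Block;
struct Inst {
  Opcode Opc;
  Block *Target; // branch destination; nullptr for non-branches
  int CondCode;  // condition operand; meaningful only for Bcc
};
struct Block {
  std::vector<Inst> Insts;
};

// Returns false on success and true when the block cannot be analyzed.
bool analyzeBranch(Block &MBB, Block *&TBB, Block *&FBB,
                   std::vector<int> &Cond) {
  TBB = FBB = nullptr;
  Cond.clear();
  std::vector<Inst> &I = MBB.Insts;

  // Count the branch instructions that terminate the block.
  std::size_t NumBranches = 0;
  while (NumBranches < I.size() && I[I.size() - 1 - NumBranches].Opc != Other)
    ++NumBranches;

  if (NumBranches == 0)
    return false; // Falls through to the successor block.
  if (NumBranches > 2)
    return true; // Three or more terminators: give up.

  const Inst Last = I.back();
  if (NumBranches == 1) {
    TBB = Last.Target; // Single branch, conditional or not.
    if (Last.Opc == Bcc)
      Cond.push_back(Last.CondCode);
    return false;
  }

  const Inst SecondLast = I[I.size() - 2];
  if (SecondLast.Opc == B && Last.Opc == B) {
    TBB = SecondLast.Target;
    I.pop_back(); // The second unconditional branch is unreachable.
    return false;
  }
  if (SecondLast.Opc == Bcc && Last.Opc == B) {
    TBB = SecondLast.Target; // Taken when the condition is true.
    Cond.push_back(SecondLast.CondCode);
    FBB = Last.Target; // Taken when the condition is false.
    return false;
  }
  return true; // Any other combination is left unanalyzed.
}
```

<div class="doc_text">
<p>Note the inverted convention that the sketch keeps from the real
interface: a return value of false means the block was understood, and true
means the caller must leave the block alone.</p>
</div>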
				1177
				1178	<!-- *********************************************************************** -->
				1179	<div class="doc_section">
				1180	<a name="InstructionSelector">Instruction Selector</a>
</div>
				1182	<!-- *********************************************************************** -->
				1183
				1184	<div class="doc_text">
<p>LLVM uses a SelectionDAG to represent LLVM IR instructions, and nodes
of the SelectionDAG ideally represent native target instructions. During code
generation, instruction selection passes are performed to convert non-native
DAG instructions into native target-specific instructions. The pass described
in <tt>XXXISelDAGToDAG.cpp</tt> is used to match patterns and perform DAG-to-DAG
instruction selection. Optionally, a pass may be defined (in
<tt>XXXBranchSelector.cpp</tt>) to perform similar DAG-to-DAG operations for branch
instructions. Later,
the code in <tt>XXXISelLowering.cpp</tt> legalizes the SelectionDAG: it replaces or removes
operations and data types that are not supported natively. </p>

<p>TableGen generates code for instruction selection using the
				1197	following target description input files:</p>
<ul>
<li><tt>XXXInstrInfo.td</tt> contains definitions of instructions in a
target-specific instruction set and generates <tt>XXXGenDAGISel.inc</tt>, which is included
in <tt>XXXISelDAGToDAG.cpp</tt>. </li>
				1202
				1203	<li><tt>XXXCallingConv.td</tt> contains the calling and return value conventions
				1204	for the target architecture, and it generates <tt>XXXGenCallingConv.inc</tt>, which is
				1205	included in <tt>XXXISelLowering.cpp</tt>.</li>
</ul>
				1207
<p>The implementation of an instruction selection pass must include
				1209	a header that declares the FunctionPass class or a subclass of FunctionPass. In
				1210	<tt>XXXTargetMachine.cpp</tt>, a Pass Manager (PM) should add each instruction selection
				1211	pass into the queue of passes to run.</p>
				1212
				1213	<p>The LLVM static
				1214	compiler (<tt>llc</tt>) is an excellent tool for visualizing the contents of DAGs. To display
				1215	the SelectionDAG before or after specific processing phases, use the command
				1216	line options for <tt>llc</tt>, described at <a
				1217	href="http://llvm.org/docs/CodeGenerator.html#selectiondag_process">
				1218	SelectionDAG Instruction Selection Process</a>.
				1219	</p>
				1220
				1221	<p>To describe instruction selector behavior, you should add
				1222	patterns for lowering LLVM code into a SelectionDAG as the last parameter of
				1223	the instruction definitions in <tt>XXXInstrInfo.td</tt>. For example, in
				1224	<tt>SparcInstrInfo.td</tt>, this entry defines a register store operation, and the last
				1225	parameter describes a pattern with the store DAG operator.</p>
				1226	</div>
				1227
				1228	<div class="doc_code">
<pre>def STrr : F3_1< 3, 0b000100, (outs), (ins MEMrr:$addr, IntRegs:$src),
                 "st $src, [$addr]", [(store IntRegs:$src, ADDRrr:$addr)]>;
				1231	</pre>
				1232	</div>
				1233
				1234	<div class="doc_text">
				1235	<p>ADDRrr is a memory mode that is also defined in <tt>SparcInstrInfo.td</tt>:</p>
				1236	</div>
				1237
				1238	<div class="doc_code">
				1239	<pre>def ADDRrr : ComplexPattern<i32, 2, "SelectADDRrr", [], []>;
				1240	</pre>
				1241	</div>
				1242
				1243	<div class="doc_text">
<p>The definition of ADDRrr refers to SelectADDRrr, which is a function defined in an
implementation of the Instruction Selector (such as <tt>SparcISelDAGToDAG.cpp</tt>). </p>
				1246
				1247	<p>In <tt>lib/Target/TargetSelectionDAG.td</tt>, the DAG operator for store
				1248	is defined below:</p>
				1249	</div>
				1250
				1251	<div class="doc_code">
<pre>def store : PatFrag<(ops node:$val, node:$ptr),
                    (st node:$val, node:$ptr), [{
  if (StoreSDNode *ST = dyn_cast<StoreSDNode>(N))
    return !ST->isTruncatingStore() &&
           ST->getAddressingMode() == ISD::UNINDEXED;
  return false;
}]>;
				1259	</pre>
				1260	</div>
				1261	<div class="doc_text">
				1262	<p><tt>XXXInstrInfo.td</tt> also generates (in <tt>XXXGenDAGISel.inc</tt>) the
				1263	<tt>SelectCode</tt> method that is used to call the appropriate processing method for an
				1264	instruction. In this example, <tt>SelectCode</tt> calls <tt>Select_ISD_STORE</tt> for the
				1265	ISD::STORE opcode.</p>
				1266	</div>
				1267
				1268	<div class="doc_code">
<pre>SDNode *SelectCode(SDOperand N) {
  ...
  MVT::ValueType NVT = N.Val->getValueType(0);
  switch (N.getOpcode()) {
  case ISD::STORE: {
    switch (NVT) {
    default:
      return Select_ISD_STORE(N);
      break;
    }
    break;
  }
  ...
				1282	</pre>
				1283	</div>
				1284	<div class="doc_text">
				1285	<p>The pattern for STrr is matched, so elsewhere in
				1286	<tt>XXXGenDAGISel.inc</tt>, code for STrr is created for <tt>Select_ISD_STORE</tt>. The <tt>Emit_22</tt> method
				1287	is also generated in <tt>XXXGenDAGISel.inc</tt> to complete the processing of this
				1288	instruction. </p>
				1289	</div>
				1290
				1291	<div class="doc_code">
<pre>SDNode *Select_ISD_STORE(const SDOperand &N) {
  SDOperand Chain = N.getOperand(0);
  if (Predicate_store(N.Val)) {
    SDOperand N1 = N.getOperand(1);
    SDOperand N2 = N.getOperand(2);
    SDOperand CPTmp0;
    SDOperand CPTmp1;

    // Pattern: (st:void IntRegs:i32:$src,
    //           ADDRrr:i32:$addr)<<P:Predicate_store>>
    // Emits: (STrr:void ADDRrr:i32:$addr, IntRegs:i32:$src)
    // Pattern complexity = 13  cost = 1  size = 0
    if (SelectADDRrr(N, N2, CPTmp0, CPTmp1) &&
        N1.Val->getValueType(0) == MVT::i32 &&
        N2.Val->getValueType(0) == MVT::i32) {
      return Emit_22(N, SP::STrr, CPTmp0, CPTmp1);
    }
...
				1310	</pre>
				1311	</div>
				1312
				1313	<!-- ======================================================================= -->
				1314	<div class="doc_subsection">
				1315	<a name="LegalizePhase">The SelectionDAG Legalize Phase</a>
				1316	</div>
				1317	<div class="doc_text">
				1318	<p>The Legalize phase converts a DAG to use types and operations
				1319	that are natively supported by the target. For natively unsupported types and
				1320	operations, you need to add code to the target-specific XXXTargetLowering implementation
				1321	to convert unsupported types and operations to supported ones.</p>
				1322
<p>In the constructor for the XXXTargetLowering class, first use the
<tt>addRegisterClass</tt> method to specify which types are supported and which register
classes are associated with them. The code for the register classes is generated
by TableGen from <tt>XXXRegisterInfo.td</tt> and placed in <tt>XXXGenRegisterInfo.h.inc</tt>. For
example, the implementation of the constructor for the SparcTargetLowering
class (in <tt>SparcISelLowering.cpp</tt>) starts with the following code:</p>
				1329	</div>
				1330
				1331	<div class="doc_code">
				1332	<pre>addRegisterClass(MVT::i32, SP::IntRegsRegisterClass);
				1333	addRegisterClass(MVT::f32, SP::FPRegsRegisterClass);
				1334	addRegisterClass(MVT::f64, SP::DFPRegsRegisterClass);
				1335	</pre>
				1336	</div>
				1337
				1338	<div class="doc_text">
				1339	<p>You should examine the node types in the ISD namespace
				1340	(<tt>include/llvm/CodeGen/SelectionDAGNodes.h</tt>)
				1341	and determine which operations the target natively supports. For operations
				1342	that do <u>not</u> have native support, add a callback to the constructor for
				1343	the XXXTargetLowering class, so the instruction selection process knows what to
				1344	do. The TargetLowering class callback methods (declared in
				1345	<tt>llvm/Target/TargetLowering.h</tt>) are:</p>
				1346	<ul>
				1347	<li><tt>setOperationAction</tt> (general operation)</li>
				1348
				1349	<li><tt>setLoadExtAction</tt> (load with extension)</li>
				1350
				1351	<li><tt>setTruncStoreAction</tt> (truncating store)</li>
				1352
				1353	<li><tt>setIndexedLoadAction</tt> (indexed load)</li>
				1354
				1355	<li><tt>setIndexedStoreAction</tt> (indexed store)</li>
				1356
				1357	<li><tt>setConvertAction</tt> (type conversion)</li>
				1358
				1359	<li><tt>setCondCodeAction</tt> (support for a given condition code)</li>
				1360	</ul>
				1361
				1362	<p>Note: on older releases, <tt>setLoadXAction</tt> is used instead of <tt>setLoadExtAction</tt>.
				1363	Also, on older releases, <tt>setCondCodeAction</tt> may not be supported. Examine your
				1364	release to see what methods are specifically supported.</p>
				1365
<p>These callbacks specify whether an operation works with a given
type (or types) and, if not, how it should be handled. In all cases, the third
parameter is a <tt>LegalizeAction</tt> enum value: <tt>Promote</tt>, <tt>Expand</tt>,
<tt>Custom</tt>, or <tt>Legal</tt>. <tt>SparcISelLowering.cpp</tt>
contains examples of all four <tt>LegalizeAction</tt> values.</p>
				1371	</div>
				1372
				1373	<!-- _______________________________________________________________________ -->
				1374	<div class="doc_subsubsection">
				1375	<a name="promote">Promote</a>
				1376	</div>
				1377
				1378	<div class="doc_text">
<p>For an operation without native support for a given type, the
specified type may be promoted to a larger type that is supported. For example,
SPARC does not support a sign-extending load for Boolean values (<tt>i1</tt> type), so
in <tt>SparcISelLowering.cpp</tt> the third
parameter below, <tt>Promote</tt>, changes <tt>i1</tt> type
values to a larger type before loading.</p>
				1385	</div>
				1386
				1387	<div class="doc_code">
				1388	<pre>setLoadExtAction(ISD::SEXTLOAD, MVT::i1, Promote);
				1389	</pre>
				1390	</div>
				1391
				1392	<!-- _______________________________________________________________________ -->
				1393	<div class="doc_subsubsection">
				1394	<a name="expand">Expand</a>
				1395	</div>
				1396	<div class="doc_text">
				1397	<p>For a type without native support, a value may need to be broken
				1398	down further, rather than promoted. For an operation without native support, a
				1399	combination of other operations may be used to similar effect. In SPARC, the
				1400	floating-point sine and cosine trig operations are supported by expansion to
				1401	other operations, as indicated by the third parameter, <tt>Expand</tt>, to
				1402	<tt>setOperationAction</tt>:</p>
				1403	</div>
				1404
				1405	<div class="doc_code">
				1406	<pre>setOperationAction(ISD::FSIN, MVT::f32, Expand);
				1407	setOperationAction(ISD::FCOS, MVT::f32, Expand);
				1408	</pre>
				1409	</div>
				1410
				1411	<!-- _______________________________________________________________________ -->
				1412	<div class="doc_subsubsection">
				1413	<a name="custom">Custom</a>
				1414	</div>
				1415	<div class="doc_text">
				1416	<p>For some operations, simple type promotion or operation expansion
				1417	may be insufficient. In some cases, a special intrinsic function must be
				1418	implemented. </p>
				1419
				1420	<p>For example, a constant value may require special treatment, or
				1421	an operation may require spilling and restoring registers in the stack and
				1422	working with register allocators. </p>
				1423
<p>As seen in the <tt>SparcISelLowering.cpp</tt> code below, to perform a type
conversion from a floating-point value to a signed integer, first call
<tt>setOperationAction</tt> with <tt>Custom</tt> as the third parameter:</p>
				1427	</div>
				1428
				1429	<div class="doc_code">
				1430	<pre>setOperationAction(ISD::FP_TO_SINT, MVT::i32, Custom);
				1431	</pre>
				1432	</div>
				1433	<div class="doc_text">
				1434	<p>In the <tt>LowerOperation</tt> method, for each <tt>Custom</tt> operation, a case
				1435	statement should be added to indicate what function to call. In the following
				1436	code, an FP_TO_SINT opcode will call the <tt>LowerFP_TO_SINT</tt> method:</p>
				1437	</div>
				1438
				1439	<div class="doc_code">
<pre>SDOperand SparcTargetLowering::LowerOperation(
                                SDOperand Op, SelectionDAG &DAG) {
  switch (Op.getOpcode()) {
  case ISD::FP_TO_SINT: return LowerFP_TO_SINT(Op, DAG);
  ...
  }
}
				1447	</pre>
				1448	</div>
				1449	<div class="doc_text">
				1450	<p>Finally, the <tt>LowerFP_TO_SINT</tt> method is implemented, using an FP
				1451	register to convert the floating-point value to an integer.</p>
				1452	</div>
				1453
				1454	<div class="doc_code">
<pre>static SDOperand LowerFP_TO_SINT(SDOperand Op, SelectionDAG &DAG) {
  assert(Op.getValueType() == MVT::i32);
  Op = DAG.getNode(SPISD::FTOI, MVT::f32, Op.getOperand(0));
  return DAG.getNode(ISD::BIT_CONVERT, MVT::i32, Op);
}
				1460	</pre>
				1461	</div>
				1462	<!-- _______________________________________________________________________ -->
				1463	<div class="doc_subsubsection">
				1464	<a name="legal">Legal</a>
				1465	</div>
				1466	<div class="doc_text">
				1467	<p>The <tt>Legal</tt> LegalizeAction enum value simply indicates that an
				1468	operation <u>is</u> natively supported. <tt>Legal</tt> represents the default condition,
				1469	so it is rarely used. In <tt>SparcISelLowering.cpp</tt>, the action for CTPOP (an
				1470	operation to count the bits set in an integer) is natively supported only for
				1471	SPARC v9. The following code enables the <tt>Expand</tt> conversion technique for non-v9
				1472	SPARC implementations.</p>
				1473	</div>
				1474
				1475	<div class="doc_code">
				1476	<pre>setOperationAction(ISD::CTPOP, MVT::i32, Expand);
				1477	...
if (TM.getSubtarget<SparcSubtarget>().isV9())
  setOperationAction(ISD::CTPOP, MVT::i32, Legal);
				1486	</pre>
				1487	</div>
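<div class="doc_text">
<p>The four actions can be pictured as entries in a lookup table keyed by
operation and type, where anything not explicitly set is treated as <tt>Legal</tt>.
The following is a minimal standalone sketch of that idea, not LLVM code: the
<tt>ToyLowering</tt> class and the <tt>Op</tt> and <tt>VT</tt> enums are hypothetical
stand-ins for the target lowering class, the ISD opcodes, and the MVT value types.</p>
</div>

```cpp
#include <cassert>
#include <map>
#include <utility>

// Hypothetical stand-ins for ISD opcodes and MVT value types (not LLVM's).
enum class Op { FSIN, FCOS, FP_TO_SINT, CTPOP, ADD };
enum class VT { i1, i32, f32, f64 };

// The four actions named in the text.
enum class LegalizeAction { Legal, Promote, Expand, Custom };

// Toy model of a target lowering class: a table from (operation, type)
// to the action the legalizer should take.
class ToyLowering {
  std::map<std::pair<Op, VT>, LegalizeAction> Actions;

public:
  explicit ToyLowering(bool IsV9) {
    setOperationAction(Op::FSIN, VT::f32, LegalizeAction::Expand);
    setOperationAction(Op::FCOS, VT::f32, LegalizeAction::Expand);
    setOperationAction(Op::FP_TO_SINT, VT::i32, LegalizeAction::Custom);
    // Expand by default, then override for subtargets with native support.
    setOperationAction(Op::CTPOP, VT::i32, LegalizeAction::Expand);
    if (IsV9)
      setOperationAction(Op::CTPOP, VT::i32, LegalizeAction::Legal);
  }

  void setOperationAction(Op O, VT T, LegalizeAction A) {
    Actions[{O, T}] = A;
  }

  // Entries that were never set fall back to Legal, the default condition.
  LegalizeAction getOperationAction(Op O, VT T) const {
    auto It = Actions.find({O, T});
    return It == Actions.end() ? LegalizeAction::Legal : It->second;
  }
};
```

<div class="doc_text">
<p>In this sketch, <tt>ToyLowering(false)</tt> reports <tt>Expand</tt> for CTPOP on
<tt>i32</tt>, while <tt>ToyLowering(true)</tt> reports <tt>Legal</tt>, because the later
call overrides the earlier one.</p>
</div>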
				1488	<!-- ======================================================================= -->
				1489	<div class="doc_subsection">
				1490	<a name="callingConventions">Calling Conventions</a>
				1491	</div>
				1492	<div class="doc_text">
<p>To support target-specific calling conventions, <tt>XXXCallingConv.td</tt>
uses interfaces (such as CCIfType and CCAssignToReg) that are defined in
<tt>lib/Target/TargetCallingConv.td</tt>. TableGen takes the target descriptor file
<tt>XXXCallingConv.td</tt> and generates the header file <tt>XXXGenCallingConv.inc</tt>, which
is typically included in <tt>XXXISelLowering.cpp</tt>. You can use the interfaces in
<tt>TargetCallingConv.td</tt> to specify:</p>
				1499	<ul>
				1500	<li>the order of parameter allocation</li>
				1501
				1502	<li>where parameters and return values are placed (that is, on the
				1503	stack or in registers)</li>
				1504
				1505	<li>which registers may be used</li>
				1506
				1507	<li>whether the caller or callee unwinds the stack</li>
				1508	</ul>
				1509
				1510	<p>The following example demonstrates the use of the CCIfType and
				1511	CCAssignToReg interfaces. If the CCIfType predicate is true (that is, if the
				1512	current argument is of type f32 or f64), then the action is performed. In this
				1513	case, the CCAssignToReg action assigns the argument value to the first
				1514	available register: either R0 or R1. </p>
				1515	</div>
				1516	<div class="doc_code">
				1517	<pre>CCIfType<[f32,f64], CCAssignToReg<[R0, R1]>>
				1518	</pre>
				1519	</div>
				1520	<div class="doc_text">
<p><tt>SparcCallingConv.td</tt> contains definitions for a target-specific return-value
calling convention (RetCC_Sparc32) and a basic 32-bit C calling convention
(CC_Sparc32). The definition of RetCC_Sparc32 (shown below) indicates which
registers are used for specified scalar return types. A single-precision float
is returned in register F0, and a double-precision float is returned in register D0. A
32-bit integer is returned in register I0 or I1. </p>
				1527	</div>
				1528
				1529	<div class="doc_code">
<pre>def RetCC_Sparc32 : CallingConv<[
  CCIfType<[i32], CCAssignToReg<[I0, I1]>>,
  CCIfType<[f32], CCAssignToReg<[F0]>>,
  CCIfType<[f64], CCAssignToReg<[D0]>>
]>;
				1535	</pre>
				1536	</div>
				1537	<div class="doc_text">
				1538	<p>The definition of CC_Sparc32 in <tt>SparcCallingConv.td</tt> introduces
				1539	CCAssignToStack, which assigns the value to a stack slot with the specified size
				1540	and alignment. In the example below, the first parameter, 4, indicates the size
				1541	of the slot, and the second parameter, also 4, indicates the stack alignment
				1542	along 4-byte units. (Special cases: if size is zero, then the ABI size is used;
				1543	if alignment is zero, then the ABI alignment is used.) </p>
				1544	</div>
				1545
				1546	<div class="doc_code">
<pre>def CC_Sparc32 : CallingConv<[
  // All arguments get passed in integer registers if there is space.
  CCIfType<[i32, f32, f64], CCAssignToReg<[I0, I1, I2, I3, I4, I5]>>,
  CCAssignToStack<4, 4>
]>;
				1552	</pre>
				1553	</div>
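<div class="doc_text">
<p>The effect of a rule list like CC_Sparc32 can be sketched in plain C++: each
argument takes the first free register from the list, and once the registers run
out, it gets a 4-byte, 4-byte-aligned stack slot. This is a simplified model
under stated assumptions, not the code TableGen generates: <tt>ToyCCState</tt>,
<tt>Location</tt>, and <tt>assignArgs</tt> are hypothetical names, and every argument
is assumed to occupy exactly one register or one slot.</p>
</div>

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Where one argument ends up (hypothetical type for this sketch).
struct Location {
  bool InReg;
  std::string Reg;       // meaningful when InReg is true
  unsigned StackOffset;  // meaningful when InReg is false
};

// Toy allocation state: tracks the next free register and the stack size.
class ToyCCState {
  std::vector<std::string> Regs;
  std::size_t NextReg = 0;
  unsigned StackSize = 0;

public:
  explicit ToyCCState(std::vector<std::string> R) : Regs(std::move(R)) {}

  // Models CCAssignToReg: succeeds only while a register is still free.
  bool assignToReg(Location &Loc) {
    if (NextReg == Regs.size())
      return false;
    Loc = {true, Regs[NextReg++], 0};
    return true;
  }

  // Models CCAssignToStack with an explicit size and alignment.
  void assignToStack(unsigned Size, unsigned Align, Location &Loc) {
    StackSize = (StackSize + Align - 1) / Align * Align;  // round up
    Loc = {false, "", StackSize};
    StackSize += Size;
  }
};

// Applies the two rules in order to each of NumArgs arguments.
std::vector<Location> assignArgs(std::size_t NumArgs) {
  ToyCCState State({"I0", "I1", "I2", "I3", "I4", "I5"});
  std::vector<Location> Locs;
  for (std::size_t I = 0; I < NumArgs; ++I) {
    Location Loc{};
    if (!State.assignToReg(Loc))
      State.assignToStack(4, 4, Loc);
    Locs.push_back(Loc);
  }
  return Locs;
}
```

<div class="doc_text">
<p>With eight arguments, this sketch places the first six in I0 through I5 and
the last two at stack offsets 0 and 4.</p>
</div>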
				1554	<div class="doc_text">
<p>CCDelegateTo is another commonly used interface, which tries to match
the current value against a specified sub-calling convention and, if a match is
found, applies it. In
the following example (in <tt>X86CallingConv.td</tt>), the definition of RetCC_X86_32_C
ends with CCDelegateTo. A value that is not assigned to register ST0
or ST1 by the first two rules is passed on to RetCC_X86Common.</p>
				1560	</div>
				1561
				1562	<div class="doc_code">
<pre>def RetCC_X86_32_C : CallingConv<[
  CCIfType<[f32], CCAssignToReg<[ST0, ST1]>>,
  CCIfType<[f64], CCAssignToReg<[ST0, ST1]>>,
  CCDelegateTo<RetCC_X86Common>
]>;
				1568	</pre>
				1569	</div>
				1570	<div class="doc_text">
				1571	<p>CCIfCC is an interface that attempts to match the given name to
				1572	the current calling convention. If the name identifies the current calling
				1573	convention, then a specified action is invoked. In the following example (in
				1574	<tt>X86CallingConv.td</tt>), if the Fast calling convention is in use, then RetCC_X86_32_Fast
				1575	is invoked. If the SSECall calling convention is in use, then RetCC_X86_32_SSE
				1576	is invoked. </p>
				1577	</div>
				1578
				1579	<div class="doc_code">
<pre>def RetCC_X86_32 : CallingConv<[
  CCIfCC<"CallingConv::Fast", CCDelegateTo<RetCC_X86_32_Fast>>,
  CCIfCC<"CallingConv::X86_SSECall", CCDelegateTo<RetCC_X86_32_SSE>>,
  CCDelegateTo<RetCC_X86_32_C>
]>;
				1585	</pre>
				1586	</div>
				1587	<div class="doc_text">
				1588	<p>Other calling convention interfaces include:</p>
				1589	<ul>
				1590	<li>CCIf <predicate, action> - if the predicate matches, apply
				1591	the action</li>
				1592
				1593	<li>CCIfInReg <action> - if the argument is marked with the
				1594	‘inreg’ attribute, then apply the action </li>
				1595
				1596	<li>CCIfNest <action> - if the argument is marked with the
				1597	‘nest’ attribute, then apply the action</li>
				1598
				1599	<li>CCIfNotVarArg <action> - if the current function does not
				1600	take a variable number of arguments, apply the action</li>
				1601
				1602	<li>CCAssignToRegWithShadow <registerList, shadowList> -
				1603	similar to CCAssignToReg, but with a shadow list of registers</li>
				1604
				1605	<li>CCPassByVal <size, align> - assign value to a stack slot
				1606	with the minimum specified size and alignment </li>
				1607
				1608	<li>CCPromoteToType <type> - promote the current value to the specified
				1609	type</li>
				1610
				1611	<li>CallingConv <[actions]> - define each calling convention
				1612	that is supported</li>
				1613	</ul>
				1614	</div>
				1615
				1616	<!-- *********************************************************************** -->
				1617	<div class="doc_section">
				1618	<a name="assemblyPrinter">Assembly Printer</a>
				1619	</div>
				1620	<!-- *********************************************************************** -->
				1621
				1622	<div class="doc_text">
<p>During the code
emission stage, the code generator may utilize an LLVM pass to produce assembly
output. To do this, implement a printer that converts
LLVM IR to a GAS-format assembly language for your target machine, using the
following steps:</p>
				1628	<ul>
				1629	<li>Define all the assembly strings for your target, adding them to
				1630	the instructions defined in the <tt>XXXInstrInfo.td</tt> file.
				1631	(See <a href="#InstructionSet">Instruction Set</a>.)
				1632	TableGen will produce an output file (<tt>XXXGenAsmWriter.inc</tt>) with an
				1633	implementation of the <tt>printInstruction</tt> method for the XXXAsmPrinter class.</li>
				1634
				1635	<li>Write <tt>XXXTargetAsmInfo.h</tt>, which contains the bare-bones
				1636	declaration of the XXXTargetAsmInfo class (a subclass of TargetAsmInfo). </li>
				1637
<li>Write <tt>XXXTargetAsmInfo.cpp</tt>, which contains target-specific values
for TargetAsmInfo properties and sometimes new implementations for methods.</li>
				1640
				1641	<li>Write <tt>XXXAsmPrinter.cpp</tt>, which implements the AsmPrinter class
				1642	that performs the LLVM-to-assembly conversion. </li>
				1643	</ul>
				1644
				1645	<p>The code in <tt>XXXTargetAsmInfo.h</tt> is usually a trivial declaration
				1646	of the XXXTargetAsmInfo class for use in <tt>XXXTargetAsmInfo.cpp</tt>. Similarly,
				1647	<tt>XXXTargetAsmInfo.cpp</tt> usually has a few declarations of XXXTargetAsmInfo replacement
				1648	values that override the default values in <tt>TargetAsmInfo.cpp</tt>. For example in
				1649	<tt>SparcTargetAsmInfo.cpp</tt>, </p>
				1650	</div>
				1651
				1652	<div class="doc_code">
<pre>SparcTargetAsmInfo::SparcTargetAsmInfo(const SparcTargetMachine &TM) {
  Data16bitsDirective = "\t.half\t";
  Data32bitsDirective = "\t.word\t";
  Data64bitsDirective = 0;  // .xword is only supported by V9.
  ZeroDirective = "\t.skip\t";
  CommentString = "!";
  ConstantPoolSection = "\t.section \".rodata\",#alloc\n";
}
				1661	</pre>
				1662	</div>
				1663	<div class="doc_text">
<p>The X86 assembly printer implementation (X86TargetAsmInfo) is an
example where the target-specific TargetAsmInfo class uses overridden methods:
<tt>ExpandInlineAsm</tt> and <tt>PreferredEHDataFormat</tt>. </p>

<p>A target-specific implementation of AsmPrinter is written in
<tt>XXXAsmPrinter.cpp</tt>, which implements the AsmPrinter class that converts LLVM
to printable assembly. The implementation must include the following headers,
which declare the AsmPrinter and MachineFunctionPass classes.
MachineFunctionPass is a subclass of FunctionPass. </p>
				1673	</div>
				1674
				1675	<div class="doc_code">
				1676	<pre>#include "llvm/CodeGen/AsmPrinter.h"
				1677	#include "llvm/CodeGen/MachineFunctionPass.h"
				1678	</pre>
				1679	</div>
				1680
				1681	<div class="doc_text">
				1682	<p>As a FunctionPass, AsmPrinter first calls <tt>doInitialization</tt> to set
				1683	up the AsmPrinter. In SparcAsmPrinter, a Mangler object is instantiated to
				1684	process variable names.</p>
				1685
				1686	<p>In <tt>XXXAsmPrinter.cpp</tt>, the <tt>runOnMachineFunction</tt> method (declared
				1687	in MachineFunctionPass) must be implemented for XXXAsmPrinter. In
				1688	MachineFunctionPass, the <tt>runOnFunction</tt> method invokes <tt>runOnMachineFunction</tt>.
				1689	Target-specific implementations of <tt>runOnMachineFunction</tt> differ, but generally
				1690	do the following to process each machine function:</p>
				1691	<ul>
				1692	<li>call <tt>SetupMachineFunction</tt> to perform initialization</li>
				1693
				1694	<li>call <tt>EmitConstantPool</tt> to print out (to the output stream)
				1695	constants which have been spilled to memory </li>
				1696
				1697	<li>call <tt>EmitJumpTableInfo</tt> to print out jump tables used by the
				1698	current function </li>
				1699
				1700	<li>print out the label for the current function</li>
				1701
				1702	<li>print out the code for the function, including basic block labels
				1703	and the assembly for the instruction (using <tt>printInstruction</tt>)</li>
				1704	</ul>
				1705	<p>The XXXAsmPrinter implementation must also include the code
				1706	generated by TableGen that is output in the <tt>XXXGenAsmWriter.inc</tt> file. The code
				1707	in <tt>XXXGenAsmWriter.inc</tt> contains an implementation of the <tt>printInstruction</tt>
				1708	method that may call these methods:</p>
				1709	<ul>
				1710	<li><tt>printOperand</tt></li>
				1711
				1712	<li><tt>printMemOperand</tt></li>
				1713
<li><tt>printCCOperand</tt> (for conditional statements)</li>
				1715
				1716	<li><tt>printDataDirective</tt></li>
				1717
				1718	<li><tt>printDeclare</tt></li>
				1719
				1720	<li><tt>printImplicitDef</tt></li>
				1721
				1722	<li><tt>printInlineAsm</tt></li>
				1723
				1724	<li><tt>printLabel</tt></li>
				1725
				1726	<li><tt>printPICJumpTableEntry</tt></li>
				1727
				1728	<li><tt>printPICJumpTableSetLabel</tt></li>
				1729	</ul>
				1730
				1731	<p>The implementations of <tt>printDeclare</tt>, <tt>printImplicitDef</tt>,
				1732	<tt>printInlineAsm</tt>, and <tt>printLabel</tt> in <tt>AsmPrinter.cpp</tt> are generally adequate for
				1733	printing assembly and do not need to be overridden. (<tt>printBasicBlockLabel</tt> is
				1734	another method that is implemented in <tt>AsmPrinter.cpp</tt> that may be directly used
				1735	in an implementation of XXXAsmPrinter.)</p>
				1736
				1737	<p>The <tt>printOperand</tt> method is implemented with a long switch/case
				1738	statement for the type of operand: register, immediate, basic block, external
				1739	symbol, global address, constant pool index, or jump table index. For an
				1740	instruction with a memory address operand, the <tt>printMemOperand</tt> method should be
				1741	implemented to generate the proper output. Similarly, <tt>printCCOperand</tt> should be
				1742	used to print a conditional operand. </p>
				1743
				1744	<p><tt>doFinalization</tt> should be overridden in XXXAsmPrinter, and
				1745	it should be called to shut down the assembly printer. During <tt>doFinalization</tt>,
				1746	global variables and constants are printed to output.</p>
				1747	</div>
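<div class="doc_text">
<p>The switch-based shape of <tt>printOperand</tt> described above can be
illustrated with a small standalone sketch. It is not the SPARC printer:
<tt>ToyOperand</tt>, <tt>OperandKind</tt>, and <tt>printStore</tt> are hypothetical, only
a few operand kinds are modeled, and the <tt>%</tt> register prefix is an assumption
made for the example. <tt>printStore</tt> fills in the <tt>"st $src, [$addr]"</tt>
assembly string shown earlier for STrr.</p>
</div>

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical operand model; a real backend inspects MachineOperand.
enum class OperandKind { Register, Immediate, BasicBlock, GlobalAddress };

struct ToyOperand {
  OperandKind Kind;
  std::string Name;  // register, block, or symbol name
  long Imm;          // used only for immediates
};

// One case per operand kind, as in the switch described in the text.
void printOperand(const ToyOperand &MO, std::ostream &O) {
  switch (MO.Kind) {
  case OperandKind::Register:
    O << '%' << MO.Name;  // assumed register prefix for this sketch
    break;
  case OperandKind::Immediate:
    O << MO.Imm;
    break;
  case OperandKind::BasicBlock:
  case OperandKind::GlobalAddress:
    O << MO.Name;
    break;
  }
}

// Expands the assembly string "st $src, [$addr]" for two operands.
std::string printStore(const ToyOperand &Src, const ToyOperand &Addr) {
  std::ostringstream O;
  O << "st ";
  printOperand(Src, O);
  O << ", [";
  printOperand(Addr, O);
  O << ']';
  return O.str();
}
```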
				1748	<!-- *********************************************************************** -->
				1749	<div class="doc_section">
				1750	<a name="subtargetSupport">Subtarget Support</a>
				1751	</div>
				1752	<!-- *********************************************************************** -->
				1753
				1754	<div class="doc_text">
				1755	<p>Subtarget support is used to inform the code generation process
				1756	of instruction set variations for a given chip set. For example, the LLVM
				1757	SPARC implementation provided covers three major versions of the SPARC
				1758	microprocessor architecture: Version 8 (V8, which is a 32-bit architecture),
				1759	Version 9 (V9, a 64-bit architecture), and the UltraSPARC architecture. V8 has
				1760	16 double-precision floating-point registers that are also usable as either 32
				1761	single-precision or 8 quad-precision registers. V8 is also purely big-endian. V9
				1762	has 32 double-precision floating-point registers that are also usable as 16
				1763	quad-precision registers, but cannot be used as single-precision registers. The
				1764	UltraSPARC architecture combines V9 with UltraSPARC Visual Instruction Set
				1765	extensions.</p>
				1766
<p>If subtarget support is needed, you should implement a
target-specific XXXSubtarget class for your architecture. This class should
process the command-line options <tt>-mcpu=</tt> and <tt>-mattr=</tt>.</p>
				1770
				1771	<p>TableGen uses definitions in the <tt>Target.td</tt> and <tt>Sparc.td</tt> files to
				1772	generate code in <tt>SparcGenSubtarget.inc</tt>. In <tt>Target.td</tt>, shown below, the
				1773	SubtargetFeature interface is defined. The first 4 string parameters of the
				1774	SubtargetFeature interface are a feature name, an attribute set by the feature,
				1775	the value of the attribute, and a description of the feature. (The fifth
				1776	parameter is a list of features whose presence is implied, and its default
				1777	value is an empty array.)</p>
				1778	</div>
				1779
				1780	<div class="doc_code">
<pre>class SubtargetFeature<string n, string a, string v, string d,
                       list<SubtargetFeature> i = []> {
  string Name = n;
  string Attribute = a;
  string Value = v;
  string Desc = d;
  list<SubtargetFeature> Implies = i;
}
				1789	</pre>
				1790	</div>
				1791	<div class="doc_text">
				1792	<p>In the <tt>Sparc.td</tt> file, the SubtargetFeature is used to define the
				1793	following features. </p>
				1794	</div>
				1795
				1796	<div class="doc_code">
<pre>def FeatureV9 : SubtargetFeature<"v9", "IsV9", "true",
                     "Enable SPARC-V9 instructions">;
def FeatureV8Deprecated : SubtargetFeature<"deprecated-v8",
                     "V8DeprecatedInsts", "true",
                     "Enable deprecated V8 instructions in V9 mode">;
def FeatureVIS : SubtargetFeature<"vis", "IsVIS", "true",
                     "Enable UltraSPARC Visual Instruction Set extensions">;
				1804	</pre>
				1805	</div>
				1806
				1807	<div class="doc_text">
				1808	<p>Elsewhere in <tt>Sparc.td</tt>, the Proc class is defined and then is used
				1809	to define particular SPARC processor subtypes that may have the previously
				1810	described features. </p>
				1811	</div>
				1812
				1813	<div class="doc_code">
				1814	<pre>class Proc<string Name, list<SubtargetFeature> Features>
				1815	: Processor<Name, NoItineraries, Features>;
				1816
				1817	def : Proc<"generic", []>;
				1818	def : Proc<"v8", []>;
				1819	def : Proc<"supersparc", []>;
				1820	def : Proc<"sparclite", []>;
				1821	def : Proc<"f934", []>;
				1822	def : Proc<"hypersparc", []>;
				1823	def : Proc<"sparclite86x", []>;
				1824	def : Proc<"sparclet", []>;
				1825	def : Proc<"tsc701", []>;
				1826	def : Proc<"v9", [FeatureV9]>;
				1827	def : Proc<"ultrasparc", [FeatureV9, FeatureV8Deprecated]>;
				1828	def : Proc<"ultrasparc3", [FeatureV9, FeatureV8Deprecated]>;
				1829	def : Proc<"ultrasparc3-vis", [FeatureV9, FeatureV8Deprecated, FeatureVIS]>;
				1830	</pre>
				1831	</div>
				1832
				1833	<div class="doc_text">
<p>From the <tt>Target.td</tt> and <tt>Sparc.td</tt> files, the resulting
<tt>SparcGenSubtarget.inc</tt> specifies enum values to identify the features, arrays of
constants to represent the CPU features and CPU subtypes, and the
<tt>ParseSubtargetFeatures</tt> method, which parses the features string that sets
specified subtarget options. The generated <tt>SparcGenSubtarget.inc</tt> file should be
included in <tt>SparcSubtarget.cpp</tt>. The target-specific implementation of the XXXSubtarget
constructor should follow this pseudocode:</p>
				1841	</div>
				1842
				1843	<div class="doc_code">
<pre>XXXSubtarget::XXXSubtarget(const Module &M, const std::string &FS) {
  // Set the default features
  // Determine default and user specified characteristics of the CPU
  // Call ParseSubtargetFeatures(FS, CPU) to parse the features string
  // Perform any additional operations
}
				1850	</pre>
				1851	</div>
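<div class="doc_text">
<p>To make the pseudocode concrete, here is a standalone sketch of the same
sequence: start from defaults, apply what the CPU name implies, then apply a
comma-separated features string in which <tt>+name</tt> enables and <tt>-name</tt>
disables a feature. It does not use the TableGen-generated parser:
<tt>ToySubtarget</tt> and its <tt>parseFeatures</tt> helper are hypothetical, and the
CPU table is a hand-written subset of the Proc definitions above.</p>
</div>

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Toy subtarget with the three attributes named in Sparc.td.
struct ToySubtarget {
  // Step 1: default features.
  bool IsV9 = false;
  bool V8DeprecatedInsts = false;
  bool IsVIS = false;

  ToySubtarget(const std::string &CPU, const std::string &FS) {
    // Step 2: features implied by the CPU name (hand-written subset).
    if (CPU == "v9") {
      IsV9 = true;
    } else if (CPU == "ultrasparc" || CPU == "ultrasparc3") {
      IsV9 = V8DeprecatedInsts = true;
    } else if (CPU == "ultrasparc3-vis") {
      IsV9 = V8DeprecatedInsts = IsVIS = true;
    }
    // Step 3: user-specified features override the CPU defaults.
    parseFeatures(FS);
  }

private:
  // Hypothetical stand-in for the generated ParseSubtargetFeatures method.
  void parseFeatures(const std::string &FS) {
    std::istringstream In(FS);
    std::string Item;
    while (std::getline(In, Item, ',')) {
      if (Item.size() < 2 || (Item[0] != '+' && Item[0] != '-'))
        continue;  // ignore malformed entries in this sketch
      const bool Enable = Item[0] == '+';
      const std::string Name = Item.substr(1);
      if (Name == "v9")
        IsV9 = Enable;
      else if (Name == "deprecated-v8")
        V8DeprecatedInsts = Enable;
      else if (Name == "vis")
        IsVIS = Enable;
    }
  }
};
```

<div class="doc_text">
<p>For example, <tt>ToySubtarget("ultrasparc3-vis", "-vis")</tt> keeps the V9 and
deprecated-V8 attributes set by the CPU name but clears <tt>IsVIS</tt>.</p>
</div>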
				1852
				1853	<!-- *********************************************************************** -->
<div class="doc_section">
  <a name="jitSupport">JIT Support</a>
</div>
<!-- *********************************************************************** -->

<div class="doc_text">
<p>A target machine implementation may optionally include a Just-In-Time
(JIT) code generator that emits machine code and auxiliary structures as binary
output that can be written directly to memory.
To add JIT support, perform the following steps:</p>
<ul>
<li>Write an <tt>XXXCodeEmitter.cpp</tt> file that contains a machine function
pass that transforms target-machine instructions into relocatable machine code.</li>

<li>Write an <tt>XXXJITInfo.cpp</tt> file that implements the JIT interfaces
for target-specific code-generation
activities, such as emitting machine code and stubs.</li>

<li>Modify <tt>XXXTargetMachine</tt> so that it provides a <tt>TargetJITInfo</tt>
object through its <tt>getJITInfo</tt> method.</li>
</ul>

<p>There are several different approaches to writing the JIT support
code. For instance, TableGen and target descriptor files may be used for
creating a JIT code generator, but are not mandatory. For the Alpha and PowerPC
target machines, TableGen is used to generate <tt>XXXGenCodeEmitter.inc</tt>, which
contains the binary encoding of machine instructions and the
<tt>getBinaryCodeForInstr</tt> method to access those encodings. Other JIT
implementations do not use TableGen for this purpose.</p>

<p>Both <tt>XXXJITInfo.cpp</tt> and <tt>XXXCodeEmitter.cpp</tt> must include the
<tt>llvm/CodeGen/MachineCodeEmitter.h</tt> header file, which defines the
<tt>MachineCodeEmitter</tt> class. This class contains code for several callback
functions that write data (in bytes, words, strings, etc.) to the output stream.</p>
</div>
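<div class="doc_text">
<p>As a rough illustration of the kind of callbacks described above, the
following self-contained sketch models an emitter that appends bytes,
little-endian words, and strings to an in-memory buffer. The class
<tt>ToyEmitter</tt> and its method names are hypothetical stand-ins chosen for
this example; they are not the actual <tt>MachineCodeEmitter</tt> interface.</p>
</div>

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for an emitter; not the real MachineCodeEmitter.
class ToyEmitter {
  std::vector<uint8_t> Buffer;  // models the output stream in memory
public:
  // Append a single byte.
  void emitByte(uint8_t B) { Buffer.push_back(B); }

  // Append a 32-bit word, least significant byte first.
  void emitWordLE(uint32_t W) {
    for (int Shift = 0; Shift < 32; Shift += 8)
      emitByte(static_cast<uint8_t>(W >> Shift));
  }

  // Append the characters of a string followed by a terminating zero byte.
  void emitString(const std::string &S) {
    for (char C : S)
      emitByte(static_cast<uint8_t>(C));
    emitByte(0);
  }

  // Read-only view of everything emitted so far.
  const std::vector<uint8_t> &bytes() const { return Buffer; }
};
```

<div class="doc_text">
<p>A code emitter pass would hold a reference to such an object and call these
methods as it walks each instruction.</p>
</div>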
<!-- ======================================================================= -->
<div class="doc_subsection">
  <a name="mce">Machine Code Emitter</a>
</div>

<div class="doc_text">
<p>In <tt>XXXCodeEmitter.cpp</tt>, a target-specific <tt>Emitter</tt> class is
implemented as a function pass (a subclass of <tt>MachineFunctionPass</tt>). The
target-specific implementation of <tt>runOnMachineFunction</tt> (invoked by
<tt>runOnFunction</tt> in <tt>MachineFunctionPass</tt>) iterates through each
<tt>MachineBasicBlock</tt> and calls <tt>emitInstruction</tt> to process each
instruction and emit binary code. <tt>emitInstruction</tt>
is largely implemented with case statements on the instruction types defined in
<tt>XXXInstrInfo.h</tt>. For example, in <tt>X86CodeEmitter.cpp</tt>, the
<tt>emitInstruction</tt> method
is built around the following switch/case statements:</p>
</div>

<div class="doc_code">
<pre>switch (Desc-&gt;TSFlags &amp; X86::FormMask) {
case X86II::Pseudo:  // for not yet implemented instructions
   ...               // or pseudo-instructions
   break;
case X86II::RawFrm:  // for instructions with a fixed opcode value
   ...
   break;
case X86II::AddRegFrm: // for instructions that have one register operand
   ...                 // added to their opcode
   break;
case X86II::MRMDestReg:// for instructions that use the Mod/RM byte
   ...                 // to specify a destination (register)
   break;
case X86II::MRMDestMem:// for instructions that use the Mod/RM byte
   ...                 // to specify a destination (memory)
   break;
case X86II::MRMSrcReg: // for instructions that use the Mod/RM byte
   ...                 // to specify a source (register)
   break;
case X86II::MRMSrcMem: // for instructions that use the Mod/RM byte
   ...                 // to specify a source (memory)
   break;
case X86II::MRM0r: case X86II::MRM1r:  // for instructions that operate on
case X86II::MRM2r: case X86II::MRM3r:  // a REGISTER r/m operand and
case X86II::MRM4r: case X86II::MRM5r:  // use the Mod/RM byte and a field
case X86II::MRM6r: case X86II::MRM7r:  // to hold extended opcode data
   ...
   break;
case X86II::MRM0m: case X86II::MRM1m:  // for instructions that operate on
case X86II::MRM2m: case X86II::MRM3m:  // a MEMORY r/m operand and
case X86II::MRM4m: case X86II::MRM5m:  // use the Mod/RM byte and a field
case X86II::MRM6m: case X86II::MRM7m:  // to hold extended opcode data
   ...
   break;
case X86II::MRMInitReg: // for instructions whose source and
   ...                  // destination are the same register
   break;
}
</pre>
</div>
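<div class="doc_text">
<p>The dispatch pattern above can be tried out in isolation. The sketch below
masks a flags word and switches on the resulting form. The enumerators, the
width of <tt>FormMask</tt>, and the position of the form bits are all
assumptions made for this example, not the values used by the X86 backend.</p>
</div>

```cpp
#include <cstdint>
#include <string>

namespace toy {
// Hypothetical encoding forms; a real target defines its own list in its
// instruction-info header.
enum Form : uint32_t { Pseudo = 0, RawFrm = 1, AddRegFrm = 2, MRMDestReg = 3 };

// Assumption for this sketch: the form occupies the low 6 bits of the flags.
const uint32_t FormMask = 0x3F;

// Classify an instruction by masking its flags word and switching on the form.
std::string describeForm(uint32_t TSFlags) {
  switch (TSFlags & FormMask) {
  case Pseudo:     return "pseudo: nothing to encode";
  case RawFrm:     return "raw: emit the fixed opcode";
  case AddRegFrm:  return "addreg: emit opcode + register number";
  case MRMDestReg: return "modrm: emit opcode, then a Mod/RM byte";
  default:         return "unhandled form";
  }
}
} // namespace toy
```

<div class="doc_text">
<p>Bits outside the mask are ignored by the switch, which is what lets a target
pack other per-instruction properties into the same flags word.</p>
</div>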
<div class="doc_text">
<p>The implementations of these case statements often first emit the
opcode and then get the operand(s). Then, depending on the operand, helper
methods may be called to process it. For example, in <tt>X86CodeEmitter.cpp</tt>,
for the <tt>X86II::AddRegFrm</tt> case, the first data emitted (by <tt>emitByte</tt>) is the
opcode added to the register operand. Then an object representing the machine
operand, <tt>MO1</tt>, is extracted. Helper methods such as <tt>isImmediate</tt>,
<tt>isGlobalAddress</tt>, <tt>isExternalSymbol</tt>, <tt>isConstantPoolIndex</tt>, and
<tt>isJumpTableIndex</tt>
determine the operand type. (<tt>X86CodeEmitter.cpp</tt> also has private methods such
as <tt>emitConstant</tt>, <tt>emitGlobalAddress</tt>,
<tt>emitExternalSymbolAddress</tt>, <tt>emitConstPoolAddress</tt>,
and <tt>emitJumpTableAddress</tt> that emit the data into the output stream.)</p>
</div>

<div class="doc_code">
<pre>case X86II::AddRegFrm:
  MCE.emitByte(BaseOpcode + getX86RegNum(MI.getOperand(CurOp++).getReg()));

  if (CurOp != NumOps) {
    const MachineOperand &amp;MO1 = MI.getOperand(CurOp++);
    unsigned Size = X86InstrInfo::sizeOfImm(Desc);
    if (MO1.isImmediate())
      emitConstant(MO1.getImm(), Size);
    else {
      unsigned rt = Is64BitMode ? X86::reloc_pcrel_word
        : (IsPIC ? X86::reloc_picrel_word : X86::reloc_absolute_word);
      if (Opcode == X86::MOV64ri)
        rt = X86::reloc_absolute_dword;  // FIXME: add X86II flag?
      if (MO1.isGlobalAddress()) {
        bool NeedStub = isa&lt;Function&gt;(MO1.getGlobal());
        bool isLazy = gvNeedsLazyPtr(MO1.getGlobal());
        emitGlobalAddress(MO1.getGlobal(), rt, MO1.getOffset(), 0,
                          NeedStub, isLazy);
      } else if (MO1.isExternalSymbol())
        emitExternalSymbolAddress(MO1.getSymbolName(), rt);
      else if (MO1.isConstantPoolIndex())
        emitConstPoolAddress(MO1.getIndex(), rt);
      else if (MO1.isJumpTableIndex())
        emitJumpTableAddress(MO1.getIndex(), rt);
    }
  }
  break;
</pre>
</div>
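<div class="doc_text">
<p>To make the immediate-operand path of this case concrete, here is a minimal
standalone sketch of the byte sequence it produces: one byte holding the opcode
plus the register number, followed by the immediate in little-endian order.
The function <tt>encodeAddRegImm</tt> is a hypothetical helper written for this
example and is not part of <tt>X86CodeEmitter.cpp</tt>.</p>
</div>

```cpp
#include <cstdint>
#include <vector>

// Emit "opcode + register number" followed by a little-endian immediate of
// Size bytes. All names here are illustrative, not LLVM APIs.
std::vector<uint8_t> encodeAddRegImm(uint8_t BaseOpcode, uint8_t RegNum,
                                     uint64_t Imm, unsigned Size) {
  std::vector<uint8_t> Out;
  // The register number is folded into the opcode byte itself.
  Out.push_back(static_cast<uint8_t>(BaseOpcode + RegNum));
  // The immediate follows, least significant byte first.
  for (unsigned I = 0; I < Size; ++I) {
    Out.push_back(static_cast<uint8_t>(Imm & 0xFF));
    Imm >>= 8;
  }
  return Out;
}
```

<div class="doc_text">
<p>For instance, the x86 instruction that moves a 32-bit immediate into a
register has base opcode <tt>0xB8</tt>, and ECX is register number 1, so
<tt>encodeAddRegImm(0xB8, 1, 0x12345678, 4)</tt> yields the bytes
<tt>B9 78 56 34 12</tt>.</p>
</div>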
<div class="doc_text">
<p>In the previous example, the emitter uses the variable <tt>rt</tt>,
which is a <tt>RelocationType</tt> enum that may be used to relocate addresses (for
example, a global address with a PIC base offset). The <tt>RelocationType</tt> enum for
a target is defined in the short target-specific <tt>XXXRelocations.h</tt> file. The
<tt>RelocationType</tt> is used by the <tt>relocate</tt> method defined in
<tt>XXXJITInfo.cpp</tt> to
rewrite addresses for referenced global symbols.</p>

<p>For example, <tt>X86Relocations.h</tt> specifies the following relocation
types for X86 addresses. In all four cases, the relocated value is added to
the value already in memory. For <tt>reloc_pcrel_word</tt> and <tt>reloc_picrel_word</tt>,
there is an additional initial adjustment.</p>
</div>

<div class="doc_code">
<pre>enum RelocationType {
  reloc_pcrel_word = 0,    // add reloc value after adjusting for the PC loc
  reloc_picrel_word = 1,   // add reloc value after adjusting for the PIC base
  reloc_absolute_word = 2, // absolute relocation; no additional adjustment
  reloc_absolute_dword = 3 // absolute relocation; no additional adjustment
};
</pre>
</div>
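<div class="doc_text">
<p>The "add to the value already in memory, after an optional adjustment" rule
can be sketched as follows. This is a simplified model, not the code in
<tt>X86JITInfo.cpp</tt>: addresses are passed as plain integers, the helper
<tt>applyReloc</tt> and its parameters are hypothetical, and the PC-relative
adjustment (subtracting the address just past a 4-byte field) is an assumption
of this sketch.</p>
</div>

```cpp
#include <cstdint>
#include <cstring>

enum RelocationType {
  reloc_pcrel_word = 0,
  reloc_picrel_word = 1,
  reloc_absolute_word = 2,
  reloc_absolute_dword = 3
};

// Hypothetical helper: patch the field stored at Field, which is assumed to
// sit at address FieldAddr once loaded. Target is the resolved symbol address
// and PICBase is the base used for PIC-relative references.
void applyReloc(RelocationType RT, uint8_t *Field, uint64_t FieldAddr,
                uint64_t Target, uint64_t PICBase) {
  if (RT == reloc_absolute_dword) {
    uint64_t V;
    std::memcpy(&V, Field, sizeof V);
    V += Target;  // 64-bit add, no adjustment
    std::memcpy(Field, &V, sizeof V);
    return;
  }
  uint64_t Adjusted = Target;  // reloc_absolute_word: no adjustment
  if (RT == reloc_pcrel_word)
    Adjusted = Target - (FieldAddr + 4);  // assumed: relative to end of field
  else if (RT == reloc_picrel_word)
    Adjusted = Target - PICBase;          // relative to the PIC base
  uint32_t V;
  std::memcpy(&V, Field, sizeof V);
  V += static_cast<uint32_t>(Adjusted);   // add to the value already in memory
  std::memcpy(Field, &V, sizeof V);
}
```

<div class="doc_text">
<p>Because the relocated value is added rather than stored, an emitter can
leave a constant displacement in the field and have it survive relocation.</p>
</div>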
<!-- ======================================================================= -->
<div class="doc_subsection">
  <a name="targetJITInfo">Target JIT Info</a>
</div>
<div class="doc_text">
<p><tt>XXXJITInfo.cpp</tt> implements the JIT interfaces for target-specific code-generation
activities, such as emitting machine code and stubs. At minimum,
a target-specific version of <tt>XXXJITInfo</tt> implements the following:</p>
<ul>
<li><tt>getLazyResolverFunction</tt> &ndash; initializes the JIT and gives the
target a function that is used for compilation</li>

<li><tt>emitFunctionStub</tt> &ndash; returns a native function with a
specified address for a callback function</li>

<li><tt>relocate</tt> &ndash; changes the addresses of referenced globals,
based on relocation types</li>

<li>callback functions that are wrappers to a function stub, used when the
real target is not initially known</li>
</ul>

<p><tt>getLazyResolverFunction</tt> is generally trivial to implement. It
stores the incoming parameter as the global <tt>JITCompilerFunction</tt> and returns the
callback function that will be used as a function wrapper. For the Alpha target
(in <tt>AlphaJITInfo.cpp</tt>), the <tt>getLazyResolverFunction</tt> implementation is simply:</p>
</div>

<div class="doc_code">
<pre>TargetJITInfo::LazyResolverFn AlphaJITInfo::getLazyResolverFunction(
                                            JITCompilerFn F)
{
  JITCompilerFunction = F;
  return AlphaCompilationCallback;
}
</pre>
</div>
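<div class="doc_text">
<p>The store-and-return pattern can be exercised outside of LLVM with a small
mock. Everything below is a simplified sketch: the function-pointer signatures
are invented for this example (the real typedefs are declared by
<tt>TargetJITInfo</tt>), and <tt>ToyCompilationCallback</tt> and
<tt>fakeCompile</tt> are hypothetical names. A real callback does not receive
ordinary C++ arguments like this.</p>
</div>

```cpp
#include <cstdint>

// Hypothetical signatures for this sketch only.
typedef std::uintptr_t (*JITCompilerFn)(std::uintptr_t StubAddr);
typedef std::uintptr_t (*LazyResolverFn)(std::uintptr_t StubAddr);

// Global slot that remembers how to call back into the JIT.
static JITCompilerFn JITCompilerFunction = nullptr;

// Wrapper reached from a stub: forwards to whatever the JIT registered.
static std::uintptr_t ToyCompilationCallback(std::uintptr_t StubAddr) {
  return JITCompilerFunction(StubAddr);
}

// Store the JIT's compile function and hand back the wrapper.
LazyResolverFn getLazyResolverFunction(JITCompilerFn F) {
  JITCompilerFunction = F;
  return ToyCompilationCallback;
}

// Stand-in for the JIT: pretends the compiled body sits 0x100 past the stub.
static std::uintptr_t fakeCompile(std::uintptr_t StubAddr) {
  return StubAddr + 0x100;
}
```

<div class="doc_text">
<p>Calling <tt>getLazyResolverFunction(fakeCompile)</tt> and then invoking the
returned wrapper routes the request through the stored function pointer, which
is the indirection a stub relies on when the real target is not yet known.</p>
</div>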
<div class="doc_text">
<p>For the X86 target, the <tt>getLazyResolverFunction</tt> implementation is
a little more complicated, because it returns a different callback function
for processors with SSE instructions and XMM registers.</p>

<p>The callback function initially saves and later restores the
callee register values, incoming arguments, and frame and return address. The
callback function needs low-level access to the registers or stack, so it is typically
implemented in assembly.</p>
</div>

<!-- *********************************************************************** -->

<hr>
<address>
  <a href="http://jigsaw.w3.org/css-validator/check/referer"><img
  src="http://jigsaw.w3.org/css-validator/images/vcss" alt="Valid CSS!"></a>
  <a href="http://validator.w3.org/check/referer"><img
  src="http://www.w3.org/Icons/valid-html401" alt="Valid HTML 4.01!"></a>

  <a href="http://www.woo.com">Mason Woo</a> and <a href="http://misha.brukman.net">Misha Brukman</a><br>
  <a href="http://llvm.org">The LLVM Compiler Infrastructure</a>
  <br>
  Last modified: $Date$
</address>

</body>
</html>