				1	================================
				2	Writing an LLVM Compiler Backend
				3	================================
				4
				5	.. toctree::
				6	:hidden:
				7
				8	HowToUseInstrMappings
				9
				10	.. sectionauthor:: Mason Woo <http://www.woo.com> and Misha Brukman <http://misha.brukman.net>
				11
				12	.. contents::
				13	:local:
				14
				15	Introduction
				16	============
				17
				18	This document describes techniques for writing compiler backends that convert
				19	the LLVM Intermediate Representation (IR) to code for a specified machine or
				20	other languages. Code intended for a specific machine can take the form of
				21	either assembly code or binary code (usable for a JIT compiler).
				22
				23	The backend of LLVM features a target-independent code generator that may
				24	create output for several types of target CPUs --- including X86, PowerPC,
				25	ARM, and SPARC. The backend may also be used to generate code targeted at SPUs
				26	of the Cell processor or GPUs to support the execution of compute kernels.
				27
				28	The document focuses on existing examples found in subdirectories of
				29	``llvm/lib/Target`` in a downloaded LLVM release. In particular, this document
				30	focuses on the example of creating a static compiler (one that emits text
				31	assembly) for a SPARC target, because SPARC has fairly standard
				32	characteristics, such as a RISC instruction set and straightforward calling
				33	conventions.
				34
				35	Audience
				36	--------
				37
				38	The audience for this document is anyone who needs to write an LLVM backend to
				39	generate code for a specific hardware or software target.
				40
				41	Prerequisite Reading
				42	--------------------
				43
				44	These essential documents must be read before reading this document:
				45
				46	* `LLVM Language Reference Manual <LangRef.html>`_ --- a reference manual for
				47	the LLVM assembly language.
				48
				49	* :doc:`CodeGenerator` --- a guide to the components (classes and code
				50	generation algorithms) for translating the LLVM internal representation into
				51	machine code for a specified target. Pay particular attention to the
				52	descriptions of code generation stages: Instruction Selection, Scheduling and
				53	Formation, SSA-based Optimization, Register Allocation, Prolog/Epilog Code
				54	Insertion, Late Machine Code Optimizations, and Code Emission.
				55
				56	* :doc:`TableGenFundamentals` --- a document that describes the TableGen
				57	(``tblgen``) application that manages domain-specific information to support
				58	LLVM code generation. TableGen processes input from a target description
				59	file (``.td`` suffix) and generates C++ code that can be used for code
				60	generation.
				61
				62	* :doc:`WritingAnLLVMPass` --- The assembly printer is a ``FunctionPass``, as
				63	are several ``SelectionDAG`` processing steps.
				64
				65	To follow the SPARC examples in this document, have a copy of `The SPARC
				66	Architecture Manual, Version 8 <http://www.sparc.org/standards/V8.pdf>`_ for
				67	reference. For details about the ARM instruction set, refer to the `ARM
				68	Architecture Reference Manual <http://infocenter.arm.com/>`_. For more about
				69	the GNU Assembler format (``GAS``), see `Using As
				70	<http://sourceware.org/binutils/docs/as/index.html>`_, especially for the
				71	assembly printer. "Using As" contains a list of target machine dependent
				72	features.
				73
				74	Basic Steps
				75	-----------
				76
				77	To write a compiler backend for LLVM that converts the LLVM IR to code for a
				78	specified target (machine or other language), follow these steps:
				79
				80	* Create a subclass of the ``TargetMachine`` class that describes
				81	characteristics of your target machine. Copy existing examples of specific
				82	``TargetMachine`` class and header files; for example, start with
				83	``SparcTargetMachine.cpp`` and ``SparcTargetMachine.h``, but change the file
				84	names for your target. Similarly, change code that references "``Sparc``" to
				85	reference your target.
				86
				87	* Describe the register set of the target. Use TableGen to generate code for
				88	register definition, register aliases, and register classes from a
				89	target-specific ``RegisterInfo.td`` input file. You should also write
				90	additional code for a subclass of the ``TargetRegisterInfo`` class that
				91	represents the register file data used for register allocation and also
				92	describes the interactions between registers.
				93
				94	* Describe the instruction set of the target. Use TableGen to generate code
				95	for target-specific instructions from target-specific versions of
				96	``TargetInstrFormats.td`` and ``TargetInstrInfo.td``. You should write
				97	additional code for a subclass of the ``TargetInstrInfo`` class to represent
				98	machine instructions supported by the target machine.
				99
				100	* Describe the selection and conversion of the LLVM IR from a Directed Acyclic
				101	Graph (DAG) representation of instructions to native target-specific
				102	instructions. Use TableGen to generate code that matches patterns and
				103	selects instructions based on additional information in a target-specific
				104	version of ``TargetInstrInfo.td``. Write code for ``XXXISelDAGToDAG.cpp``,
				105	where ``XXX`` identifies the specific target, to perform pattern matching and
				106	DAG-to-DAG instruction selection. Also write code in ``XXXISelLowering.cpp``
				107	to replace or remove operations and data types that are not supported
				108	natively in a SelectionDAG.
				109
				110	* Write code for an assembly printer that converts LLVM IR to a GAS format for
				111	your target machine. You should add assembly strings to the instructions
				112	defined in your target-specific version of ``TargetInstrInfo.td``. You
				113	should also write code for a subclass of ``AsmPrinter`` that performs the
				114	LLVM-to-assembly conversion and a trivial subclass of ``TargetAsmInfo``.
				115
				116	* Optionally, add support for subtargets (i.e., variants with different
				117	capabilities). You should also write code for a subclass of the
				118	``TargetSubtarget`` class, which allows you to use the ``-mcpu=`` and
				119	``-mattr=`` command-line options.
				120
				121	* Optionally, add JIT support and create a machine code emitter (subclass of
				122	``TargetJITInfo``) that is used to emit binary code directly into memory.
				123
				124	In the ``.cpp`` and ``.h`` files, initially stub up these methods and then
				125	implement them later. Initially, you may not know which private members
				126	the class will need and which components will need to be subclassed.
				127
				128	Preliminaries
				129	-------------
				130
				131	To actually create your compiler backend, you need to create and modify a few
				132	files. The absolute minimum is discussed here. But to actually use the LLVM
				133	target-independent code generator, you must perform the steps described in the
				134	:doc:`LLVM Target-Independent Code Generator <CodeGenerator>` document.
				135
				136	First, you should create a subdirectory under ``lib/Target`` to hold all the
				137	files related to your target. If your target is called "Dummy", create the
				138	directory ``lib/Target/Dummy``.
				139
				140	In this new directory, create a ``Makefile``. It is easiest to copy a
				141	``Makefile`` of another target and modify it. It should at least contain the
				142	``LEVEL``, ``LIBRARYNAME`` and ``TARGET`` variables, and then include
				143	``$(LEVEL)/Makefile.common``. The library can be named ``LLVMDummy`` (for
				144	example, see the MIPS target). Alternatively, you can split the library into
				145	``LLVMDummyCodeGen`` and ``LLVMDummyAsmPrinter``, the latter of which should be
				146	implemented in a subdirectory below ``lib/Target/Dummy`` (for example, see the
				147	PowerPC target).
				148
				149	Note that these two naming schemes are hardcoded into ``llvm-config``. Using
				150	any other naming scheme will confuse ``llvm-config`` and produce a lot of
				151	(seemingly unrelated) linker errors when linking ``llc``.
				152
				153	To make your target actually do something, you need to implement a subclass of
				154	``TargetMachine``. This implementation should typically be in the file
				155	``lib/Target/DummyTargetMachine.cpp``, but any file in the ``lib/Target``
				156	directory will be built and should work. To use LLVM's target-independent code
				157	generator, you should do what all current machine backends do: create a
				158	subclass of ``LLVMTargetMachine``. (To create a target from scratch, create a
				159	subclass of ``TargetMachine``.)
				160
				161	To get LLVM to actually build and link your target, you need to add it to the
				162	``TARGETS_TO_BUILD`` variable. To do this, you modify the configure script to
				163	know about your target when parsing the ``--enable-targets`` option. Search
				164	the configure script for ``TARGETS_TO_BUILD``, add your target to the lists
				165	there (some creativity required), and then reconfigure. Alternatively, you can
				166	change ``autoconf/configure.ac`` and regenerate configure by running
				167	``./autoconf/AutoRegen.sh``.
				168
				169	Target Machine
				170	==============
				171
				172	``LLVMTargetMachine`` is designed as a base class for targets implemented with
				173	the LLVM target-independent code generator. The ``LLVMTargetMachine`` class
				174	should be specialized by a concrete target class that implements the various
				175	virtual methods. ``LLVMTargetMachine`` is defined as a subclass of
				176	``TargetMachine`` in ``include/llvm/Target/TargetMachine.h``. The
				177	``TargetMachine`` class implementation (``TargetMachine.cpp``) also processes
				178	numerous command-line options.
				179
				180	To create a concrete target-specific subclass of ``LLVMTargetMachine``, start
				181	by copying an existing ``TargetMachine`` class and header. You should name the
				182	files that you create to reflect your specific target. For instance, for the
				183	SPARC target, name the files ``SparcTargetMachine.h`` and
				184	``SparcTargetMachine.cpp``.
				185
				186	For a target machine ``XXX``, the implementation of ``XXXTargetMachine`` must
				187	have access methods to obtain objects that represent target components. These
				188	methods are named ``get*Info``, and are intended to obtain the instruction set
				189	(``getInstrInfo``), register set (``getRegisterInfo``), stack frame layout
				190	(``getFrameInfo``), and similar information. ``XXXTargetMachine`` must also
				191	implement the ``getDataLayout`` method to access an object with target-specific
				192	data characteristics, such as data type size and alignment requirements.
				193
				194	For instance, for the SPARC target, the header file ``SparcTargetMachine.h``
				195	declares prototypes for several ``get*Info`` and ``getDataLayout`` methods that
				196	simply return a class member.
				197
				198	.. code-block:: c++
				199
				200	namespace llvm {
				201
				202	class Module;
				203
				204	class SparcTargetMachine : public LLVMTargetMachine {
				205	const DataLayout DataLayout; // Calculates type size & alignment
				206	SparcSubtarget Subtarget;
				207	SparcInstrInfo InstrInfo;
				208	TargetFrameInfo FrameInfo;
				209
				210	protected:
				211	virtual const TargetAsmInfo *createTargetAsmInfo() const;
				212
				213	public:
				214	SparcTargetMachine(const Module &M, const std::string &FS);
				215
				216	virtual const SparcInstrInfo *getInstrInfo() const {return &InstrInfo; }
				217	virtual const TargetFrameInfo *getFrameInfo() const {return &FrameInfo; }
				218	virtual const TargetSubtarget *getSubtargetImpl() const{return &Subtarget; }
				219	virtual const TargetRegisterInfo *getRegisterInfo() const {
				220	return &InstrInfo.getRegisterInfo();
				221	}
				222	virtual const DataLayout *getDataLayout() const { return &DataLayout; }
				223	static unsigned getModuleMatchQuality(const Module &M);
				224
				225	// Pass Pipeline Configuration
				226	virtual bool addInstSelector(PassManagerBase &PM, bool Fast);
				227	virtual bool addPreEmitPass(PassManagerBase &PM, bool Fast);
				228	};
				229
				230	} // end namespace llvm
				231
				232	* ``getInstrInfo()``
				233	* ``getRegisterInfo()``
				234	* ``getFrameInfo()``
				235	* ``getDataLayout()``
				236	* ``getSubtargetImpl()``
				237
				238	For some targets, you also need to support the following methods:
				239
				240	* ``getTargetLowering()``
				241	* ``getJITInfo()``
				242
				243	In addition, the ``XXXTargetMachine`` constructor should specify a
				244	``TargetDescription`` string that determines the data layout for the target
				245	machine, including characteristics such as pointer size, alignment, and
				246	endianness. For example, the constructor for ``SparcTargetMachine`` contains
				247	the following:
				248
				249	.. code-block:: c++
				250
				251	SparcTargetMachine::SparcTargetMachine(const Module &M, const std::string &FS)
				252	: DataLayout("E-p:32:32-f128:128:128"),
				253	Subtarget(M, FS), InstrInfo(Subtarget),
				254	FrameInfo(TargetFrameInfo::StackGrowsDown, 8, 0) {
				255	}
				256
				257	Hyphens separate portions of the ``TargetDescription`` string.
				258
				259	* An upper-case "``E``" in the string indicates a big-endian target data model.
				260	A lower-case "``e``" indicates little-endian.
				261
				262	* "``p:``" is followed by pointer information: size, ABI alignment, and
				263	preferred alignment. If only two figures follow "``p:``", then the first
				264	value is pointer size, and the second value is both ABI and preferred
				265	alignment.
				266
				267	* Then a letter for numeric type alignment: "``i``", "``f``", "``v``", or
				268	"``a``" (corresponding to integer, floating point, vector, or aggregate).
				269	"``i``", "``v``", or "``a``" are followed by ABI alignment and preferred
				270	alignment. "``f``" is followed by three values: the first indicates the size
				271	of a long double, then ABI alignment, and then preferred alignment.
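
As an illustration of the field layout described above, the following
standalone C++ sketch splits a description string on hyphens and reads the
endianness and ``p:`` fields. It is not LLVM's ``DataLayout`` parser:
``LayoutSketch`` and ``parseLayoutSketch`` are hypothetical names, and every
other field (such as ``f128:128:128``) is simply skipped.

.. code-block:: c++

  #include <cassert>
  #include <sstream>
  #include <string>
  #include <vector>

  // Hypothetical, simplified model of the endianness and pointer fields.
  // Values are stored exactly as they are written in the string.
  struct LayoutSketch {
    bool BigEndian = false;
    unsigned PointerSize = 0;
    unsigned PointerABIAlign = 0;
    unsigned PointerPrefAlign = 0;
  };

  // Splits S at every occurrence of Sep.
  static std::vector<std::string> splitSketch(const std::string &S, char Sep) {
    std::vector<std::string> Parts;
    std::stringstream In(S);
    std::string Part;
    while (std::getline(In, Part, Sep))
      Parts.push_back(Part);
    return Parts;
  }

  LayoutSketch parseLayoutSketch(const std::string &Desc) {
    LayoutSketch L;
    for (const std::string &Field : splitSketch(Desc, '-')) {
      if (Field == "E") {
        L.BigEndian = true;
      } else if (Field == "e") {
        L.BigEndian = false;
      } else if (Field.compare(0, 2, "p:") == 0) {
        std::vector<std::string> Nums = splitSketch(Field.substr(2), ':');
        assert(Nums.size() >= 2 && "expected p:<size>:<abi>[:<pref>]");
        L.PointerSize = static_cast<unsigned>(std::stoul(Nums[0]));
        L.PointerABIAlign = static_cast<unsigned>(std::stoul(Nums[1]));
        // With only two figures, the second is both ABI and preferred alignment.
        L.PointerPrefAlign = Nums.size() > 2
                                 ? static_cast<unsigned>(std::stoul(Nums[2]))
                                 : L.PointerABIAlign;
      }
      // All other fields are ignored in this sketch.
    }
    return L;
  }

For the SPARC string ``E-p:32:32-f128:128:128``, this sketch reports a
big-endian target with a pointer size of 32 and both pointer alignments equal
to 32, because only two figures follow ``p:``.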
				272
				273	Target Registration
				274	===================
				275
				276	You must also register your target with the ``TargetRegistry``, which is what
				277	other LLVM tools use to look up and use your target at runtime. The
				278	``TargetRegistry`` can be used directly, but for most targets there are helper
				279	templates which should take care of the work for you.
				280
				281	All targets should declare a global ``Target`` object which is used to
				282	represent the target during registration. Then, in the target's ``TargetInfo``
				283	library, the target should define that object and use the ``RegisterTarget``
				284	template to register the target. For example, the Sparc registration code
				285	looks like this:
				286
				287	.. code-block:: c++
				288
				289	Target llvm::TheSparcTarget;
				290
				291	extern "C" void LLVMInitializeSparcTargetInfo() {
				292	RegisterTarget<Triple::sparc, /*HasJIT=*/false>
				293	X(TheSparcTarget, "sparc", "Sparc");
				294	}
				295
				296	This allows the ``TargetRegistry`` to look up the target by name or by target
				297	triple. In addition, most targets will also register additional features which
				298	are available in separate libraries. These registration steps are separate,
				299	because some clients may wish to only link in some parts of the target --- the
				300	JIT code generator does not require the use of the assembly printer, for
				301	example. Here is an example of registering the Sparc assembly printer:
				302
				303	.. code-block:: c++
				304
				305	extern "C" void LLVMInitializeSparcAsmPrinter() {
				306	RegisterAsmPrinter<SparcAsmPrinter> X(TheSparcTarget);
				307	}
				308
				309	For more information, see "`llvm/Target/TargetRegistry.h
				310	</doxygen/TargetRegistry_8h-source.html>`_".
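
The registration idiom above, in which constructing a helper object has the
side effect of adding the target to a global table, can be sketched without
any LLVM headers. Everything below is a hypothetical stand-in
(``TargetSketch``, ``RegistrySketch``, ``RegisterTargetSketch``, and the
"dummy" target name); the real ``Target`` and ``TargetRegistry`` classes carry
much more state and also support lookup by triple.

.. code-block:: c++

  #include <cassert>
  #include <map>
  #include <string>

  // Hypothetical stand-in for a target record.
  struct TargetSketch {
    std::string Name;
    std::string ShortDesc;
  };

  // Hypothetical stand-in for a registry that maps names to targets.
  class RegistrySketch {
  public:
    static void add(TargetSketch &T, const std::string &Name,
                    const std::string &Desc) {
      assert(!Name.empty() && "a target needs a name to be looked up");
      T.Name = Name;
      T.ShortDesc = Desc;
      table()[Name] = &T;
    }

    // Returns the registered target, or nullptr if the name is unknown.
    static const TargetSketch *lookup(const std::string &Name) {
      auto It = table().find(Name);
      return It == table().end() ? nullptr : It->second;
    }

  private:
    // Function-local static avoids initialization-order problems.
    static std::map<std::string, TargetSketch *> &table() {
      static std::map<std::string, TargetSketch *> Table;
      return Table;
    }
  };

  // Helper whose constructor performs the registration.
  struct RegisterTargetSketch {
    RegisterTargetSketch(TargetSketch &T, const char *Name, const char *Desc) {
      RegistrySketch::add(T, Name, Desc);
    }
  };

  TargetSketch TheDummyTarget;

  void initializeDummyTargetInfo() {
    RegisterTargetSketch X(TheDummyTarget, "dummy", "Dummy");
  }

A client would call ``initializeDummyTargetInfo()`` once and then use
``RegistrySketch::lookup("dummy")`` to find the target. Keeping each
registration in its own function is what lets a client link in only the
pieces it needs.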
				311
				312	Register Set and Register Classes
				313	=================================
				314
				315	You should describe a concrete target-specific class that represents the
				316	register file of a target machine. This class is called ``XXXRegisterInfo``
				317	(where ``XXX`` identifies the target) and represents the register file
				318	data that is used for register allocation. It also describes the interactions
				319	between registers.
				320
				321	You also need to define register classes to categorize related registers. A
				322	register class should be added for groups of registers that are all treated the
				323	same way for some instruction. Typical examples are register classes for
				324	integer, floating-point, or vector registers. A register allocator allows an
				325	instruction to use any register in a specified register class to perform the
				326	instruction in a similar manner. Virtual registers are allocated
				327	to instructions from these sets, and register classes let the
				328	target-independent register allocator automatically choose the actual
				329	registers.
				330
				331	Much of the code for registers, including register definition, register
				332	aliases, and register classes, is generated by TableGen from
				333	``XXXRegisterInfo.td`` input files and placed in ``XXXGenRegisterInfo.h.inc``
				334	and ``XXXGenRegisterInfo.inc`` output files. Some of the code in the
				335	implementation of ``XXXRegisterInfo`` requires hand-coding.
				336
				337	Defining a Register
				338	-------------------
				339
				340	The ``XXXRegisterInfo.td`` file typically starts with register definitions for
				341	a target machine. The ``Register`` class (specified in ``Target.td``) is used
				342	to define an object for each register. The specified string ``n`` becomes the
				343	``Name`` of the register. The basic ``Register`` object does not have any
				344	subregisters and does not specify any aliases.
				345
				346	.. code-block:: llvm
				347
				348	class Register<string n> {
				349	string Namespace = "";
				350	string AsmName = n;
				351	string Name = n;
				352	int SpillSize = 0;
				353	int SpillAlignment = 0;
				354	list<Register> Aliases = [];
				355	list<Register> SubRegs = [];
				356	list<int> DwarfNumbers = [];
				357	}
				358
				359	For example, in the ``X86RegisterInfo.td`` file, there are register definitions
				360	that utilize the ``Register`` class, such as:
				361
				362	.. code-block:: llvm
				363
				364	def AL : Register<"AL">, DwarfRegNum<[0, 0, 0]>;
				365
				366	This defines the register ``AL`` and assigns it values (with ``DwarfRegNum``)
				367	that are used by ``gcc``, ``gdb``, or a debug information writer to identify a
				368	register. For register ``AL``, ``DwarfRegNum`` takes an array of 3 values
				369	representing 3 different modes: the first element is for X86-64, the second for
				370	exception handling (EH) on X86-32, and the third is generic. -1 is a special
				371	Dwarf number that indicates the gcc number is undefined, and -2 indicates the
				372	register number is invalid for this mode.
				373
				374	From the previously described line in the ``X86RegisterInfo.td`` file, TableGen
				375	generates this code in the ``X86GenRegisterInfo.inc`` file:
				376
				377	.. code-block:: c++
				378
				379	static const unsigned GR8[] = { X86::AL, ... };
				380
				381	const unsigned AL_AliasSet[] = { X86::AX, X86::EAX, X86::RAX, 0 };
				382
				383	const TargetRegisterDesc RegisterDescriptors[] = {
				384	...
				385	{ "AL", "AL", AL_AliasSet, Empty_SubRegsSet, Empty_SubRegsSet, AL_SuperRegsSet }, ...
				386
				387	From the register info file, TableGen generates a ``TargetRegisterDesc`` object
				388	for each register. ``TargetRegisterDesc`` is defined in
				389	``include/llvm/Target/TargetRegisterInfo.h`` with the following fields:
				390
				391	.. code-block:: c++
				392
				393	struct TargetRegisterDesc {
				394	const char *AsmName; // Assembly language name for the register
				395	const char *Name; // Printable name for the reg (for debugging)
				396	const unsigned *AliasSet; // Register Alias Set
				397	const unsigned *SubRegs; // Sub-register set
				398	const unsigned *ImmSubRegs; // Immediate sub-register set
				399	const unsigned *SuperRegs; // Super-register set
				400	};
				401
				402	TableGen uses the entire target description file (``.td``) to determine text
				403	names for the register (in the ``AsmName`` and ``Name`` fields of
				404	``TargetRegisterDesc``) and the relationships of other registers to the defined
				405	register (in the other ``TargetRegisterDesc`` fields). In this example, other
				406	definitions establish the registers "``AX``", "``EAX``", and "``RAX``" as
				407	aliases for one another, so TableGen generates a null-terminated array
				408	(``AL_AliasSet``) for this register alias set.
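
To show how a null-terminated alias set such as ``AL_AliasSet`` can be
consumed, here is a small standalone sketch. The enum values and the helper
names ``isInAliasSet`` and ``aliasCount`` are hypothetical; the only property
taken from the generated code above is that register number ``0`` ends the
list.

.. code-block:: c++

  #include <cassert>

  // Hypothetical register numbers. 0 is kept as "no register" so that it can
  // terminate the lists.
  enum RegSketch : unsigned { NoReg = 0, AL, AX, EAX, RAX };

  // Same shape as the generated AL_AliasSet shown above.
  const unsigned AL_AliasSetSketch[] = { AX, EAX, RAX, 0 };

  // Returns true if Reg appears in the zero-terminated AliasSet.
  bool isInAliasSet(const unsigned *AliasSet, unsigned Reg) {
    assert(Reg != NoReg && "0 terminates the list and is not a register");
    for (const unsigned *A = AliasSet; *A != 0; ++A)
      if (*A == Reg)
        return true;
    return false;
  }

  // Counts the entries before the terminating 0.
  unsigned aliasCount(const unsigned *AliasSet) {
    unsigned N = 0;
    while (AliasSet[N] != 0)
      ++N;
    return N;
  }

Because the terminator is part of the array, no separate length field is
needed in the descriptor.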
				409
				410	The ``Register`` class is commonly used as a base class for more complex
				411	classes. In ``Target.td``, the ``Register`` class is the base for the
				412	``RegisterWithSubRegs`` class that is used to define registers that need to
				413	specify subregisters in the ``SubRegs`` list, as shown here:
				414
				415	.. code-block:: llvm
				416
				417	class RegisterWithSubRegs<string n, list<Register> subregs> : Register<n> {
				418	let SubRegs = subregs;
				419	}
				420
				421	In ``SparcRegisterInfo.td``, additional register classes are defined for SPARC:
				422	a ``Register`` subclass, ``SparcReg``, and further subclasses: ``Ri``, ``Rf``,
				423	and ``Rd``. SPARC registers are identified by 5-bit ID numbers, which is a
				424	feature common to these subclasses. Note the use of "``let``" expressions to
				425	override values that are initially defined in a superclass (such as ``SubRegs``
				426	field in the ``Rd`` class).
				427
				428	.. code-block:: llvm
				429
				430	class SparcReg<string n> : Register<n> {
				431	field bits<5> Num;
				432	let Namespace = "SP";
				433	}
				434	// Ri - 32-bit integer registers
				435	class Ri<bits<5> num, string n> :
				436	SparcReg<n> {
				437	let Num = num;
				438	}
				439	// Rf - 32-bit floating-point registers
				440	class Rf<bits<5> num, string n> :
				441	SparcReg<n> {
				442	let Num = num;
				443	}
				444	// Rd - Slots in the FP register file for 64-bit floating-point values.
				445	class Rd<bits<5> num, string n, list<Register> subregs> : SparcReg<n> {
				446	let Num = num;
				447	let SubRegs = subregs;
				448	}
				449
				450	In the ``SparcRegisterInfo.td`` file, there are register definitions that
				451	utilize these subclasses of ``Register``, such as:
				452
				453	.. code-block:: llvm
				454
				455	def G0 : Ri< 0, "G0">, DwarfRegNum<[0]>;
				456	def G1 : Ri< 1, "G1">, DwarfRegNum<[1]>;
				457	...
				458	def F0 : Rf< 0, "F0">, DwarfRegNum<[32]>;
				459	def F1 : Rf< 1, "F1">, DwarfRegNum<[33]>;
				460	...
				461	def D0 : Rd< 0, "F0", [F0, F1]>, DwarfRegNum<[32]>;
				462	def D1 : Rd< 2, "F2", [F2, F3]>, DwarfRegNum<[34]>;
				463
				464	The last two registers shown above (``D0`` and ``D1``) are double-precision
				465	floating-point registers that are aliases for pairs of single-precision
				466	floating-point sub-registers. In addition to aliases, the sub-register and
				467	super-register relationships of the defined register are in fields of a
				468	register's ``TargetRegisterDesc``.
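
The two ``Rd`` definitions shown above suggest a numbering pattern: ``D0``
uses ``Num`` 0 with sub-registers ``F0`` and ``F1``, and ``D1`` uses ``Num`` 2
with ``F2`` and ``F3``. The sketch below assumes that this pattern continues
for the remaining double-precision registers, which the excerpt does not
show. ``DRegSketch`` and ``makeDRegSketch`` are hypothetical names.

.. code-block:: c++

  #include <cassert>
  #include <string>
  #include <utility>

  // Hypothetical record mirroring the arguments of the Rd class.
  struct DRegSketch {
    unsigned Num;                                 // 5-bit register ID
    std::string AsmName;                          // name used in assembly
    std::pair<std::string, std::string> SubRegs;  // single-precision halves
  };

  // Builds the record for D<Index>, assuming D<n> covers F<2n> and F<2n+1>.
  DRegSketch makeDRegSketch(unsigned Index) {
    unsigned Even = 2 * Index;
    assert(Even < 32 && "Num must fit in the 5-bit field");
    std::string Lo = "F" + std::to_string(Even);
    std::string Hi = "F" + std::to_string(Even + 1);
    return DRegSketch{Even, Lo, {Lo, Hi}};
  }

For index 1, this yields ``Num`` 2, the assembly name ``F2``, and the
sub-registers ``F2`` and ``F3``, matching the ``D1`` definition above.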
				469
				470	Defining a Register Class
				471	-------------------------
				472
				473	The ``RegisterClass`` class (specified in ``Target.td``) is used to define an
				474	object that represents a group of related registers and also defines the
				475	default allocation order of the registers. A target description file
				476	``XXXRegisterInfo.td`` that uses ``Target.td`` can construct register classes
				477	using the following class:
				478
				479	.. code-block:: llvm
				480
				481	class RegisterClass<string namespace,
				482	list<ValueType> regTypes, int alignment, dag regList> {
				483	string Namespace = namespace;
				484	list<ValueType> RegTypes = regTypes;
				485	int Size = 0; // spill size, in bits; zero lets tblgen pick the size
				486	int Alignment = alignment;
				487
				488	// CopyCost is the cost of copying a value between two registers
				489	// default value 1 means a single instruction
				490	// A negative value means copying is extremely expensive or impossible
				491	int CopyCost = 1;
				492	dag MemberList = regList;
				493
				494	// for register classes that are subregisters of this class
				495	list<RegisterClass> SubRegClassList = [];
				496
				497	code MethodProtos = [{}]; // to insert arbitrary code
				498	code MethodBodies = [{}];
				499	}
				500
				501	To define a ``RegisterClass``, use the following 4 arguments:
				502
				503	* The first argument of the definition is the name of the namespace.
				504
				505	* The second argument is a list of ``ValueType`` register type values that are
				506	defined in ``include/llvm/CodeGen/ValueTypes.td``. Defined values include
				507	integer types (such as ``i16``, ``i32``, and ``i1`` for Boolean),
				508	floating-point types (``f32``, ``f64``), and vector types (for example,
				509	``v8i16`` for an ``8 x i16`` vector). All registers in a ``RegisterClass``
				510	must have the same ``ValueType``, but some registers may store vector data in
				511	different configurations. For example, a register that can process a 128-bit
				512	vector may be able to handle 16 8-bit integer elements, 8 16-bit integers, 4
				513	32-bit integers, and so on.
				514
				515	* The third argument of the ``RegisterClass`` definition specifies the
				516	alignment required of the registers when they are stored or loaded to
				517	memory.
				518
				519	* The final argument, ``regList``, specifies which registers are in this class.
				520	If an alternative allocation order method is not specified, then ``regList``
				521	also defines the order of allocation used by the register allocator. Besides
				522	simply listing registers with ``(add R0, R1, ...)``, more advanced set
				523	operators are available. See ``include/llvm/Target/Target.td`` for more
				524	information.
				525
				526	In ``SparcRegisterInfo.td``, three ``RegisterClass`` objects are defined:
				527	``FPRegs``, ``DFPRegs``, and ``IntRegs``. For all three register classes, the
				528	first argument defines the namespace with the string "``SP``". ``FPRegs``
				529	defines a group of 32 single-precision floating-point registers (``F0`` to
				530	``F31``); ``DFPRegs`` defines a group of 16 double-precision registers
				531	(``D0-D15``).
				532
				533	.. code-block:: llvm
				534
				535	// F0, F1, F2, ..., F31
				536	def FPRegs : RegisterClass<"SP", [f32], 32, (sequence "F%u", 0, 31)>;
				537
				538	def DFPRegs : RegisterClass<"SP", [f64], 64,
				539	(add D0, D1, D2, D3, D4, D5, D6, D7, D8,
				540	D9, D10, D11, D12, D13, D14, D15)>;
				541
				542	def IntRegs : RegisterClass<"SP", [i32], 32,
				543	(add L0, L1, L2, L3, L4, L5, L6, L7,
				544	I0, I1, I2, I3, I4, I5,
				545	O0, O1, O2, O3, O4, O5, O7,
				546	G1,
				547	// Non-allocatable regs:
				548	G2, G3, G4,
				549	O6, // stack ptr
				550	I6, // frame ptr
				551	I7, // return address
				552	G0, // constant zero
				553	G5, G6, G7 // reserved for kernel
				554	)>;
				555
				556	Using ``SparcRegisterInfo.td`` with TableGen generates several output files
				557	that are intended for inclusion in other source code that you write.
				558	``SparcRegisterInfo.td`` generates ``SparcGenRegisterInfo.h.inc``, which should
				559	be included in the header file for the SPARC register implementation
				560	that you write (``SparcRegisterInfo.h``). In
				561	``SparcGenRegisterInfo.h.inc`` a new structure is defined called
				562	``SparcGenRegisterInfo`` that uses ``TargetRegisterInfo`` as its base. It also
				563	specifies types, based upon the defined register classes: ``DFPRegsClass``,
				564	``FPRegsClass``, and ``IntRegsClass``.
				565
				566	``SparcRegisterInfo.td`` also generates ``SparcGenRegisterInfo.inc``, which is
				567	included at the bottom of ``SparcRegisterInfo.cpp``, the SPARC register
				568	implementation. The code below shows only the generated integer registers and
				569	associated register classes. The order of registers in ``IntRegs`` reflects
				570	the order in the definition of ``IntRegs`` in the target description file.
				571
				572	.. code-block:: c++
				573
				574	// IntRegs Register Class...
				575	static const unsigned IntRegs[] = {
				576	SP::L0, SP::L1, SP::L2, SP::L3, SP::L4, SP::L5,
				577	SP::L6, SP::L7, SP::I0, SP::I1, SP::I2, SP::I3,
				578	SP::I4, SP::I5, SP::O0, SP::O1, SP::O2, SP::O3,
				579	SP::O4, SP::O5, SP::O7, SP::G1, SP::G2, SP::G3,
				580	SP::G4, SP::O6, SP::I6, SP::I7, SP::G0, SP::G5,
				581	SP::G6, SP::G7,
				582	};
				583
				584	// IntRegsVTs Register Class Value Types...
				585	static const MVT::ValueType IntRegsVTs[] = {
				586	MVT::i32, MVT::Other
				587	};
				588
				589	namespace SP { // Register class instances
				590	DFPRegsClass DFPRegsRegClass;
				591	FPRegsClass FPRegsRegClass;
				592	IntRegsClass IntRegsRegClass;
				593	...
				594	// IntRegs Sub-register Classes...
				595	static const TargetRegisterClass* const IntRegsSubRegClasses [] = {
				596	NULL
				597	};
				598	...
				599	// IntRegs Super-register Classes...
				600	static const TargetRegisterClass* const IntRegsSuperRegClasses [] = {
				601	NULL
				602	};
				603	...
				604	// IntRegs Register Class sub-classes...
				605	static const TargetRegisterClass* const IntRegsSubclasses [] = {
				606	NULL
				607	};
				608	...
				609	// IntRegs Register Class super-classes...
				610	static const TargetRegisterClass* const IntRegsSuperclasses [] = {
				611	NULL
				612	};
				613
				614	IntRegsClass::IntRegsClass() : TargetRegisterClass(IntRegsRegClassID,
				615	IntRegsVTs, IntRegsSubclasses, IntRegsSuperclasses, IntRegsSubRegClasses,
				616	IntRegsSuperRegClasses, 4, 4, 1, IntRegs, IntRegs + 32) {}
				617	}
				618
				619	The register allocators will avoid using reserved registers, and callee saved
				620	registers are not used until all the volatile registers have been used. That
				621	is usually good enough, but in some cases it may be necessary to provide custom
				622	allocation orders.
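
The interaction between a class's allocation order and the set of reserved
registers can be sketched as follows. This is a deliberately naive model, not
one of LLVM's register allocators: it only walks the order and skips reserved
or occupied registers, and it does not model the preference for volatile over
callee-saved registers. The register IDs and the ``pickRegister`` helper are
hypothetical.

.. code-block:: c++

  #include <bitset>
  #include <cassert>
  #include <vector>

  // Hypothetical register IDs for a tiny machine.
  enum RegId : unsigned { L0, L1, I0, O0, G1, O6, I6, G0 };
  constexpr unsigned NumRegsSketch = 8;
  using RegSet = std::bitset<NumRegsSketch>;

  // Returns the first register in Order that is neither reserved nor in use,
  // or -1 if no such register exists.
  int pickRegister(const std::vector<unsigned> &Order, const RegSet &Reserved,
                   const RegSet &InUse) {
    for (unsigned Reg : Order) {
      assert(Reg < NumRegsSketch && "register ID out of range");
      if (!Reserved.test(Reg) && !InUse.test(Reg))
        return static_cast<int>(Reg);
    }
    return -1;
  }

With the order ``{L0, L1, O6, G0}`` and ``O6`` and ``G0`` marked reserved,
the helper hands out ``L0``, then ``L1``, and then reports that nothing is
left, even though two registers in the class were never used.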
				623
				624	Implement a subclass of ``TargetRegisterInfo``
				625	----------------------------------------------
				626
				627	The final step is to hand code portions of ``XXXRegisterInfo``, which
				628	implements the interface described in ``TargetRegisterInfo.h`` (see
				629	:ref:`TargetRegisterInfo`). These functions return ``0``, ``NULL``, or
				630	``false``, unless overridden. Here is a list of functions that are overridden
				631	for the SPARC implementation in ``SparcRegisterInfo.cpp``:
				632
				633	* ``getCalleeSavedRegs`` --- Returns a list of callee-saved registers in the
				634	order of the desired callee-save stack frame offset.
				635
				636	* ``getReservedRegs`` --- Returns a bitset indexed by physical register
				637	numbers, indicating if a particular register is unavailable.
				638
				639	* ``hasFP`` --- Returns a Boolean indicating if a function should have a
				640	dedicated frame pointer register.
				641
				642	* ``eliminateCallFramePseudoInstr`` --- If call frame setup or destroy pseudo
				643	instructions are used, this can be called to eliminate them.
				644
				645	* ``eliminateFrameIndex`` --- Eliminate abstract frame indices from
				646	instructions that may use them.
				647
				648	* ``emitPrologue`` --- Insert prologue code into the function.
				649
				650	* ``emitEpilogue`` --- Insert epilogue code into the function.
				651
				652	.. _instruction-set:
				653
				654	Instruction Set
				655	===============
				656
				657	During the early stages of code generation, the LLVM IR code is converted to a
				658	``SelectionDAG`` with nodes that are instances of the ``SDNode`` class
				659	containing target instructions. An ``SDNode`` has an opcode, operands, type
				660	requirements, and operation properties (for example, whether the operation is
				661	commutative or whether it loads from memory). The various operation node
				662	types are described in the ``include/llvm/CodeGen/SelectionDAGNodes.h`` file
				663	(values of the ``NodeType`` enum in the ``ISD`` namespace).
				664
				665	TableGen uses the following target description (``.td``) input files to
				666	generate much of the code for instruction definition:
				667
				668	* ``Target.td`` --- Where the ``Instruction``, ``Operand``, ``InstrInfo``, and
				669	other fundamental classes are defined.
				670
				671	* ``TargetSelectionDAG.td`` --- Used by ``SelectionDAG`` instruction selection
				672	generators, contains ``SDTC*`` classes (selection DAG type constraint),
				673	definitions of ``SelectionDAG`` nodes (such as ``imm``, ``cond``, ``bb``,
				674	``add``, ``fadd``, ``sub``), and pattern support (``Pattern``, ``Pat``,
				675	``PatFrag``, ``PatLeaf``, ``ComplexPattern``).
				676
				677	* ``XXXInstrFormats.td`` --- Patterns for definitions of target-specific
				678	instructions.
				679
				680	* ``XXXInstrInfo.td`` --- Target-specific definitions of instruction templates,
				681	condition codes, and instructions of an instruction set. For architecture
				682	modifications, a different file name may be used. For example, for Pentium
				683	with SSE instructions, this file is ``X86InstrSSE.td``, and for Pentium with
				684	MMX, this file is ``X86InstrMMX.td``.
				685
				686	There is also a target-specific ``XXX.td`` file, where ``XXX`` is the name of
				687	the target. The ``XXX.td`` file includes the other ``.td`` input files, but
				688	its contents are only directly important for subtargets.
				689
				690	You should describe a concrete target-specific class ``XXXInstrInfo`` that
				691	represents machine instructions supported by a target machine.
				692	``XXXInstrInfo`` contains an array of ``XXXInstrDescriptor`` objects, each of
				693	which describes one instruction. An instruction descriptor defines:
				694
				695	* Opcode mnemonic
				696	* Number of operands
				697	* List of implicit register definitions and uses
				698	* Target-independent properties (such as memory access, is commutable)
				699	* Target-specific flags
				700
				701	The ``Instruction`` class (defined in ``Target.td``) is mostly used as a base for
				702	more complex instruction classes.
				703
				704	.. code-block:: llvm
				705
				706	class Instruction {
				707	string Namespace = "";
				708	dag OutOperandList; // A dag containing the MI def operand list.
				709	dag InOperandList; // A dag containing the MI use operand list.
				710	string AsmString = ""; // The .s format to print the instruction with.
				711	list<dag> Pattern; // Set to the DAG pattern for this instruction.
				712	list<Register> Uses = [];
				713	list<Register> Defs = [];
				714	list<Predicate> Predicates = []; // predicates turned into isel match code
				715	... remainder not shown for space ...
				716	}
				717
				718	A ``SelectionDAG`` node (``SDNode``) should contain an object representing a
				719	target-specific instruction that is defined in ``XXXInstrInfo.td``. The
				720	instruction objects should represent instructions from the architecture manual
				721	of the target machine (such as the SPARC Architecture Manual for the SPARC
				722	target).
				723
				724	A single instruction from the architecture manual is often modeled as multiple
				725	target instructions, depending upon its operands. For example, a manual might
				726	describe an add instruction that takes a register or an immediate operand. An
				727	LLVM target could model this with two instructions named ``ADDri`` and
				728	``ADDrr``.
				729
				730	You should define a class for each instruction category and define each opcode
				731	as a subclass of the category with appropriate parameters such as the fixed
				732	binary encoding of opcodes and extended opcodes. You should map the register
				733	bits to the bits of the instruction in which they are encoded (for the JIT).
				734	Also you should specify how the instruction should be printed when the
				735	automatic assembly printer is used.
				736
				737	As is described in the SPARC Architecture Manual, Version 8, there are three
				738	major 32-bit formats for instructions. Format 1 is only for the ``CALL``
				739	instruction. Format 2 is for branch on condition codes and ``SETHI`` (set high
				740	bits of a register) instructions. Format 3 is for other instructions.
				741
				742	Each of these formats has corresponding classes in ``SparcInstrFormats.td``.
				743	``InstSP`` is a base class for other instruction classes. Additional base
				744	classes are specified for more precise formats: for example, in
				745	``SparcInstrFormats.td``, ``F2_1`` is for ``SETHI``, and ``F2_2`` is for
				746	branches. There are three other base classes: ``F3_1`` for register/register
				747	operations, ``F3_2`` for register/immediate operations, and ``F3_3`` for
				748	floating-point operations. ``SparcInstrInfo.td`` also adds the base class
				749	``Pseudo`` for synthetic SPARC instructions.
				750
				751	``SparcInstrInfo.td`` largely consists of operand and instruction definitions
				752	for the SPARC target. In ``SparcInstrInfo.td``, the following target
				753	description file entry, ``LDrr``, defines the Load Integer instruction for a
				754	Word (the ``LD`` SPARC opcode) from a memory address to a register. The first
				755	parameter, the value 3 (``11``\ :sub:`2`), is the operation value for this
				756	category of operation. The second parameter (``000000``\ :sub:`2`) is the
				757	specific operation value for ``LD``/Load Word. The third parameter is the
				758	output destination, which is a register operand and defined in the ``Register``
				759	target description file (``IntRegs``).
				760
				761	.. code-block:: llvm
				762
				763	def LDrr : F3_1 <3, 0b000000, (outs IntRegs:$dst), (ins MEMrr:$addr),
				764	"ld [$addr], $dst",
				765	[(set IntRegs:$dst, (load ADDRrr:$addr))]>;
				766
				767	The fourth parameter is the input source, which uses the address operand
				768	``MEMrr`` that is defined earlier in ``SparcInstrInfo.td``:
				769
				770	.. code-block:: llvm
				771
				772	def MEMrr : Operand<i32> {
				773	let PrintMethod = "printMemOperand";
				774	let MIOperandInfo = (ops IntRegs, IntRegs);
				775	}
				776
				777	The fifth parameter is a string that is used by the assembly printer and can be
				778	left as an empty string until the assembly printer interface is implemented.
				779	The sixth and final parameter is the pattern used to match the instruction
				780	during the SelectionDAG Select Phase described in :doc:`CodeGenerator`.
				781	This parameter is detailed in the next section, :ref:`instruction-selector`.
				782
				783	Instruction class definitions are not overloaded for different operand types,
				784	so separate versions of instructions are needed for register, memory, or
				785	immediate value operands. For example, to perform a Load Integer instruction
				786	for a Word whose address uses an immediate operand, the following instruction
				787	class is defined:
				788
				789	.. code-block:: llvm
				790
				791	def LDri : F3_2 <3, 0b000000, (outs IntRegs:$dst), (ins MEMri:$addr),
				792	"ld [$addr], $dst",
				793	[(set IntRegs:$dst, (load ADDRri:$addr))]>;
				794
				795	Writing these definitions for so many similar instructions can involve a lot of
				796	cut and paste. In ``.td`` files, the ``multiclass`` directive enables the
				797	creation of templates to define several instruction classes at once (using the
				798	``defm`` directive). For example in ``SparcInstrInfo.td``, the ``multiclass``
				799	pattern ``F3_12`` is defined to create 2 instruction classes each time
				800	``F3_12`` is invoked:
				801
				802	.. code-block:: llvm
				803
				804	multiclass F3_12 <string OpcStr, bits<6> Op3Val, SDNode OpNode> {
				805	def rr : F3_1 <2, Op3Val,
				806	(outs IntRegs:$dst), (ins IntRegs:$b, IntRegs:$c),
				807	!strconcat(OpcStr, " $b, $c, $dst"),
				808	[(set IntRegs:$dst, (OpNode IntRegs:$b, IntRegs:$c))]>;
				809	def ri : F3_2 <2, Op3Val,
				810	(outs IntRegs:$dst), (ins IntRegs:$b, i32imm:$c),
				811	!strconcat(OpcStr, " $b, $c, $dst"),
				812	[(set IntRegs:$dst, (OpNode IntRegs:$b, simm13:$c))]>;
				813	}
				814
				815	So when the ``defm`` directive is used for the ``XOR`` and ``ADD``
				816	instructions, as seen below, it creates four instruction objects: ``XORrr``,
				817	``XORri``, ``ADDrr``, and ``ADDri``.
				818
				819	.. code-block:: llvm
				820
				821	defm XOR : F3_12<"xor", 0b000011, xor>;
				822	defm ADD : F3_12<"add", 0b000000, add>;
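
The expansion performed by ``defm`` can be mimicked in ordinary C++ to make the naming rule concrete. This is a rough model, not TableGen's implementation: ``InstrDef`` and ``expandF3_12`` are hypothetical names, and only a few of the fields of a real record are represented.

.. code-block:: c++

   #include <string>
   #include <vector>

   // Hypothetical record type; real TableGen records carry many more fields.
   struct InstrDef {
     std::string Name;    // e.g. "XORrr"
     std::string Format;  // base class used inside the multiclass
     unsigned Op3;        // the Op3Val argument
     std::string Asm;     // result of !strconcat(OpcStr, " $b, $c, $dst")
   };

   // Models one `defm` of F3_12: the defm name is prefixed to each inner
   // `def` name, producing an "rr" record and an "ri" record.
   std::vector<InstrDef> expandF3_12(const std::string &Prefix,
                                     const std::string &OpcStr,
                                     unsigned Op3Val) {
     const std::string Operands = " $b, $c, $dst";
     return {
         {Prefix + "rr", "F3_1", Op3Val, OpcStr + Operands},
         {Prefix + "ri", "F3_2", Op3Val, OpcStr + Operands},
     };
   }

Calling ``expandF3_12("XOR", "xor", 0x03)`` and ``expandF3_12("ADD", "add", 0x00)`` in this model gives the four names ``XORrr``, ``XORri``, ``ADDrr``, and ``ADDri``.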
				823
				824	``SparcInstrInfo.td`` also includes definitions for condition codes that are
				825	referenced by branch instructions. The following definitions in
				826	``SparcInstrInfo.td`` indicate the bit location of the SPARC condition code.
				827	For example, the 10\ :sup:`th` bit represents the "greater than" condition for
				828	integers, and the 22\ :sup:`nd` bit represents the "greater than" condition for
				829	floats.
				830
				831	.. code-block:: llvm
				832
				833	def ICC_NE : ICC_VAL< 9>; // Not Equal
				834	def ICC_E : ICC_VAL< 1>; // Equal
				835	def ICC_G : ICC_VAL<10>; // Greater
				836	...
				837	def FCC_U : FCC_VAL<23>; // Unordered
				838	def FCC_G : FCC_VAL<22>; // Greater
				839	def FCC_UG : FCC_VAL<21>; // Unordered or Greater
				840	...
				841
				842	(Note that ``Sparc.h`` also defines enums that correspond to the same SPARC
				843	condition codes. Care must be taken to ensure the values in ``Sparc.h``
				844	correspond to the values in ``SparcInstrInfo.td``. I.e., ``SPCC::ICC_NE = 9``,
				845	``SPCC::FCC_U = 23`` and so on.)
				846
				847	Instruction Operand Mapping
				848	---------------------------
				849
				850	The code generator backend maps instruction operands to fields in the
				851	instruction. Operands are assigned to unbound fields in the instruction in the
				852	order they are defined. Fields are bound when they are assigned a value. For
				853	example, the Sparc target defines the ``XNORrr`` instruction as a ``F3_1``
				854	format instruction having three operands.
				855
				856	.. code-block:: llvm
				857
				858	def XNORrr : F3_1<2, 0b000111,
				859	(outs IntRegs:$dst), (ins IntRegs:$b, IntRegs:$c),
				860	"xnor $b, $c, $dst",
				861	[(set IntRegs:$dst, (not (xor IntRegs:$b, IntRegs:$c)))]>;
				862
				863	The instruction templates in ``SparcInstrFormats.td`` show the base class for
				864	``F3_1`` is ``InstSP``.
				865
				866	.. code-block:: llvm
				867
				868	class InstSP<dag outs, dag ins, string asmstr, list<dag> pattern> : Instruction {
				869	field bits<32> Inst;
				870	let Namespace = "SP";
				871	bits<2> op;
				872	let Inst{31-30} = op;
				873	dag OutOperandList = outs;
				874	dag InOperandList = ins;
				875	let AsmString = asmstr;
				876	let Pattern = pattern;
				877	}
				878
				879	``InstSP`` leaves the ``op`` field unbound.
				880
				881	.. code-block:: llvm
				882
				883	class F3<dag outs, dag ins, string asmstr, list<dag> pattern>
				884	: InstSP<outs, ins, asmstr, pattern> {
				885	bits<5> rd;
				886	bits<6> op3;
				887	bits<5> rs1;
				888	let op{1} = 1; // Op = 2 or 3
				889	let Inst{29-25} = rd;
				890	let Inst{24-19} = op3;
				891	let Inst{18-14} = rs1;
				892	}
				893
				894	``F3`` partially binds the ``op`` field (bit 1) and defines the ``rd``, ``op3``,
				895	and ``rs1`` fields. ``F3`` format instructions will bind operands to the ``rd``,
				896	``op3``, and ``rs1`` fields.
				897
				898	.. code-block:: llvm
				899
				900	class F3_1<bits<2> opVal, bits<6> op3val, dag outs, dag ins,
				901	string asmstr, list<dag> pattern> : F3<outs, ins, asmstr, pattern> {
				902	bits<8> asi = 0; // asi not currently used
				903	bits<5> rs2;
				904	let op = opVal;
				905	let op3 = op3val;
				906	let Inst{13} = 0; // i field = 0
				907	let Inst{12-5} = asi; // address space identifier
				908	let Inst{4-0} = rs2;
				909	}
				910
				911	``F3_1`` binds the ``op`` and ``op3`` fields and defines the ``rs2`` field. ``F3_1``
				912	format instructions will bind the operands to the ``rd``, ``rs1``, and ``rs2``
				913	fields. This results in the ``XNORrr`` instruction binding ``$dst``, ``$b``,
				914	and ``$c`` operands to the ``rd``, ``rs1``, and ``rs2`` fields respectively.
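
To check where each operand ends up in the 32-bit instruction word, the bit ranges from the ``InstSP``, ``F3``, and ``F3_1`` definitions can be written out as plain C++. This is only a sketch: ``encodeF3_1`` is a hypothetical helper, not part of LLVM or of TableGen's generated output, and the register numbers in the usage note are arbitrary.

.. code-block:: c++

   #include <cassert>
   #include <cstdint>

   // Hypothetical helper (not an LLVM API): packs the fields of an
   // F3_1-format instruction into a 32-bit word.
   inline uint32_t encodeF3_1(uint32_t op, uint32_t op3, uint32_t rd,
                              uint32_t rs1, uint32_t rs2, uint32_t asi = 0) {
     assert(op < 4 && op3 < 64 && rd < 32 && rs1 < 32 && rs2 < 32 && asi < 256);
     uint32_t inst = 0;
     inst |= op << 30;   // Inst{31-30} = op
     inst |= rd << 25;   // Inst{29-25} = rd
     inst |= op3 << 19;  // Inst{24-19} = op3
     inst |= rs1 << 14;  // Inst{18-14} = rs1
                         // Inst{13}    = 0 (i field)
     inst |= asi << 5;   // Inst{12-5}  = asi
     inst |= rs2;        // Inst{4-0}   = rs2
     return inst;
   }

With the ``XNORrr`` values (``op`` = 2, ``op3`` = ``0b000111``) and, for illustration, ``$dst`` = 1, ``$b`` = 2, ``$c`` = 3, the call ``encodeF3_1(2, 0x07, 1, 2, 3)`` yields ``0x82388003``.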
				915
				916	Instruction Relation Mapping
				917	----------------------------
				918
				919	This TableGen feature is used to relate instructions with each other. It is
				920	particularly useful when you have multiple instruction formats and need to
				921	switch between them after instruction selection. This entire feature is driven
				922	by relation models which can be defined in ``XXXInstrInfo.td`` files
				923	according to the target-specific instruction set. Relation models are defined
				924	using ``InstrMapping`` class as a base. TableGen parses all the models
				925	and generates instruction relation maps using the specified information.
				926	Relation maps are emitted as tables in the ``XXXGenInstrInfo.inc`` file
				927	along with the functions to query them. For the detailed information on how to
				928	use this feature, please refer to :doc:`HowToUseInstrMappings`.
				929
				930	Implement a subclass of ``TargetInstrInfo``
				931	-------------------------------------------
				932
				933	The final step is to hand code portions of ``XXXInstrInfo``, which implements
				934	the interface described in ``TargetInstrInfo.h`` (see :ref:`TargetInstrInfo`).
				935	These functions return ``0`` or a Boolean or they assert, unless overridden.
				936	Here's a list of functions that are overridden for the SPARC implementation in
				937	``SparcInstrInfo.cpp``:
				938
				939	* ``isLoadFromStackSlot`` --- If the specified machine instruction is a direct
				940	load from a stack slot, return the register number of the destination and the
				941	``FrameIndex`` of the stack slot.
				942
				943	* ``isStoreToStackSlot`` --- If the specified machine instruction is a direct
				944	store to a stack slot, return the register number of the source and the
				945	``FrameIndex`` of the stack slot.
				946
				947	* ``copyPhysReg`` --- Copy values between a pair of physical registers.
				948
				949	* ``storeRegToStackSlot`` --- Store a register value to a stack slot.
				950
				951	* ``loadRegFromStackSlot`` --- Load a register value from a stack slot.
				952
				953	* ``storeRegToAddr`` --- Store a register value to memory.
				954
				955	* ``loadRegFromAddr`` --- Load a register value from memory.
				956
				957	* ``foldMemoryOperand`` --- Attempt to fold a load or store into the specified
				958	machine instruction for the specified operand(s).
				959
				960	Branch Folding and If Conversion
				961	--------------------------------
				962
				963	Performance can be improved by combining instructions or by eliminating
				964	instructions that are never reached. The ``AnalyzeBranch`` method in
				965	``XXXInstrInfo`` may be implemented to examine conditional instructions and
				966	remove unnecessary instructions. ``AnalyzeBranch`` looks at the end of a
				967	machine basic block (MBB) for opportunities for improvement, such as branch
				968	folding and if conversion. The ``BranchFolder`` and ``IfConverter`` machine
				969	function passes (see the source files ``BranchFolding.cpp`` and
				970	``IfConversion.cpp`` in the ``lib/CodeGen`` directory) call ``AnalyzeBranch``
				971	to improve the control flow graph that represents the instructions.
				972
				973	Several implementations of ``AnalyzeBranch`` (for ARM, Alpha, and X86) can be
				974	examined as models for your own ``AnalyzeBranch`` implementation. Since SPARC
				975	does not implement a useful ``AnalyzeBranch``, the ARM target implementation is
				976	shown below.
				977
				978	``AnalyzeBranch`` returns a Boolean value and takes four parameters:
				979
				980	* ``MachineBasicBlock &MBB`` --- The incoming block to be examined.
				981
				982	* ``MachineBasicBlock *&TBB`` --- A destination block that is returned. For a
				983	conditional branch that evaluates to true, ``TBB`` is the destination.
				984
				985	* ``MachineBasicBlock *&FBB`` --- For a conditional branch that evaluates to
				986	false, ``FBB`` is returned as the destination.
				987
				988	* ``std::vector<MachineOperand> &Cond`` --- List of operands to evaluate a
				989	condition for a conditional branch.
				990
				991	In the simplest case, if a block ends without a branch, then it falls through
				992	to the successor block. No destination blocks are specified for either ``TBB``
				993	or ``FBB``, so both parameters return ``NULL``. The start of
				994	``AnalyzeBranch`` (see code below for the ARM target) shows the function
				995	parameters and the code for the simplest case.
				996
				997	.. code-block:: c++
				998
				999	bool ARMInstrInfo::AnalyzeBranch(MachineBasicBlock &MBB,
				1000	MachineBasicBlock *&TBB,
				1001	MachineBasicBlock *&FBB,
				1002	std::vector<MachineOperand> &Cond) const
				1003	{
				1004	MachineBasicBlock::iterator I = MBB.end();
				1005	if (I == MBB.begin() || !isUnpredicatedTerminator(--I))
				1006	return false;
				1007
				1008	If a block ends with a single unconditional branch instruction, then
				1009	``AnalyzeBranch`` (shown below) should return the destination of that branch in
				1010	the ``TBB`` parameter.
				1011
				1012	.. code-block:: c++
				1013
				1014	if (LastOpc == ARM::B || LastOpc == ARM::tB) {
				1015	TBB = LastInst->getOperand(0).getMBB();
				1016	return false;
				1017	}
				1018
				1019	If a block ends with two unconditional branches, then the second branch is
				1020	never reached. In that situation, as shown below, remove the last branch
				1021	instruction and return the penultimate branch in the ``TBB`` parameter.
				1022
				1023	.. code-block:: c++
				1024
				1025	if ((SecondLastOpc == ARM::B || SecondLastOpc == ARM::tB) &&
				1026	(LastOpc == ARM::B || LastOpc == ARM::tB)) {
				1027	TBB = SecondLastInst->getOperand(0).getMBB();
				1028	I = LastInst;
				1029	I->eraseFromParent();
				1030	return false;
				1031	}
				1032
				1033	A block may end with a single conditional branch instruction that falls through
				1034	to the successor block if the condition evaluates to false. In that case,
				1035	``AnalyzeBranch`` (shown below) should return the destination of that
				1036	conditional branch in the ``TBB`` parameter and a list of operands in the
				1037	``Cond`` parameter to evaluate the condition.
				1038
				1039	.. code-block:: c++
				1040
				1041	if (LastOpc == ARM::Bcc || LastOpc == ARM::tBcc) {
				1042	// Block ends with fall-through condbranch.
				1043	TBB = LastInst->getOperand(0).getMBB();
				1044	Cond.push_back(LastInst->getOperand(1));
				1045	Cond.push_back(LastInst->getOperand(2));
				1046	return false;
				1047	}
				1048
				1049	If a block ends with both a conditional branch and an ensuing unconditional
				1050	branch, then ``AnalyzeBranch`` (shown below) should return the conditional
				1051	branch destination (assuming it corresponds to a conditional evaluation of
				1052	"``true``") in the ``TBB`` parameter and the unconditional branch destination
				1053	in the ``FBB`` (corresponding to a conditional evaluation of "``false``"). A
				1054	list of operands to evaluate the condition should be returned in the ``Cond``
				1055	parameter.
				1056
				1057	.. code-block:: c++
				1058
				1059	unsigned SecondLastOpc = SecondLastInst->getOpcode();
				1060
				1061	if ((SecondLastOpc == ARM::Bcc && LastOpc == ARM::B) ||
				1062	(SecondLastOpc == ARM::tBcc && LastOpc == ARM::tB)) {
				1063	TBB = SecondLastInst->getOperand(0).getMBB();
				1064	Cond.push_back(SecondLastInst->getOperand(1));
				1065	Cond.push_back(SecondLastInst->getOperand(2));
				1066	FBB = LastInst->getOperand(0).getMBB();
				1067	return false;
				1068	}
				1069
				1070	For the last two cases (ending with a single conditional branch or ending with
				1071	one conditional and one unconditional branch), the operands returned in the
				1072	``Cond`` parameter can be passed to methods of other instructions to create new
				1073	branches or perform other operations. An implementation of ``AnalyzeBranch``
				1074	requires the helper methods ``RemoveBranch`` and ``InsertBranch`` to manage
				1075	subsequent operations.
				1076
				1077	``AnalyzeBranch`` should return false indicating success in most circumstances.
				1078	``AnalyzeBranch`` should only return true when the method is stumped about what
				1079	to do, for example, if a block has three terminating branches.
				1080	``AnalyzeBranch`` may return true if it encounters a terminator it cannot
				1081	handle, such as an indirect branch.
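
The cases above can be gathered into one self-contained model. The sketch below is not the ARM implementation: ``Kind``, ``Inst``, and ``analyzeBranch`` are hypothetical stand-ins, blocks are identified by integer index with ``-1`` in place of ``NULL``, and a single integer stands in for the condition operands.

.. code-block:: c++

   #include <cstddef>
   #include <vector>

   // Toy stand-ins for machine instructions; all names are hypothetical.
   enum class Kind { Other, Branch, CondBranch, IndirectBranch };

   struct Inst {
     Kind K;
     int Target;    // index of the destination block, -1 if none
     int CondCode;  // meaningful for CondBranch only
   };

   // Returns false on success and true when the terminators cannot be
   // analyzed, following the convention described above.
   bool analyzeBranch(std::vector<Inst> &MBB, int &TBB, int &FBB,
                      std::vector<int> &Cond) {
     TBB = FBB = -1;
     Cond.clear();

     // Count the terminators at the end of the block.
     std::size_t NumTerm = 0;
     while (NumTerm < MBB.size() &&
            MBB[MBB.size() - 1 - NumTerm].K != Kind::Other)
       ++NumTerm;

     if (NumTerm == 0)
       return false;  // no branch: falls through to the successor
     if (NumTerm > 2)
       return true;   // three or more terminators: give up

     const Inst Last = MBB.back();
     if (NumTerm == 1) {
       if (Last.K == Kind::Branch) {  // single unconditional branch
         TBB = Last.Target;
         return false;
       }
       if (Last.K == Kind::CondBranch) {  // conditional branch, then fall through
         TBB = Last.Target;
         Cond.push_back(Last.CondCode);
         return false;
       }
       return true;  // e.g. an indirect branch
     }

     const Inst SecondLast = MBB[MBB.size() - 2];
     if (SecondLast.K == Kind::CondBranch && Last.K == Kind::Branch) {
       TBB = SecondLast.Target;
       Cond.push_back(SecondLast.CondCode);
       FBB = Last.Target;
       return false;
     }
     if (SecondLast.K == Kind::Branch && Last.K == Kind::Branch) {
       TBB = SecondLast.Target;
       MBB.pop_back();  // the second unconditional branch is never reached
       return false;
     }
     return true;  // any other combination is not handled by this model
   }

Unlike the ARM code, this model resets ``TBB``, ``FBB``, and ``Cond`` on entry so that every return path leaves them in a defined state; that is a simplification for the sketch, not a statement about LLVM's contract.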
				1082
				1083	.. _instruction-selector:
				1084
				1085	Instruction Selector
				1086	====================
				1087
				1088	LLVM uses a ``SelectionDAG`` to represent LLVM IR instructions, and nodes of
				1089	the ``SelectionDAG`` ideally represent native target instructions. During code
				1090	generation, instruction selection passes are performed to convert non-native
				1091	DAG instructions into native target-specific instructions. The pass described
				1092	in ``XXXISelDAGToDAG.cpp`` is used to match patterns and perform DAG-to-DAG
				1093	instruction selection. Optionally, a pass may be defined (in
				1094	``XXXBranchSelector.cpp``) to perform similar DAG-to-DAG operations for branch
				1095	instructions. Later, the code in ``XXXISelLowering.cpp`` replaces or removes
				1096	operations and data types not supported natively (legalizes) in a
				1097	``SelectionDAG``.
				1098
				1099	TableGen generates code for instruction selection using the following target
				1100	description input files:
				1101
				1102	* ``XXXInstrInfo.td`` --- Contains definitions of instructions in a
				1103	target-specific instruction set, generates ``XXXGenDAGISel.inc``, which is
				1104	included in ``XXXISelDAGToDAG.cpp``.
				1105
				1106	* ``XXXCallingConv.td`` --- Contains the calling and return value conventions
				1107	for the target architecture, and it generates ``XXXGenCallingConv.inc``,
				1108	which is included in ``XXXISelLowering.cpp``.
				1109
				1110	The implementation of an instruction selection pass must include a header that
				1111	declares the ``FunctionPass`` class or a subclass of ``FunctionPass``. In
				1112	``XXXTargetMachine.cpp``, a Pass Manager (PM) should add each instruction
				1113	selection pass into the queue of passes to run.
				1114
				1115	The LLVM static compiler (``llc``) is an excellent tool for visualizing the
				1116	contents of DAGs. To display the ``SelectionDAG`` before or after specific
				1117	processing phases, use the command line options for ``llc``, described at
				1118	:ref:`SelectionDAG-Process`.
				1119
				1120	To describe instruction selector behavior, you should add patterns for lowering
				1121	LLVM code into a ``SelectionDAG`` as the last parameter of the instruction
				1122	definitions in ``XXXInstrInfo.td``. For example, in ``SparcInstrInfo.td``,
				1123	this entry defines a register store operation, and the last parameter describes
				1124	a pattern with the store DAG operator.
				1125
				1126	.. code-block:: llvm
				1127
				1128	def STrr : F3_1< 3, 0b000100, (outs), (ins MEMrr:$addr, IntRegs:$src),
				1129	"st $src, [$addr]", [(store IntRegs:$src, ADDRrr:$addr)]>;
				1130
				1131	``ADDRrr`` is a memory mode that is also defined in ``SparcInstrInfo.td``:
				1132
				1133	.. code-block:: llvm
				1134
				1135	def ADDRrr : ComplexPattern<i32, 2, "SelectADDRrr", [], []>;
				1136
				1137	The definition of ``ADDRrr`` refers to ``SelectADDRrr``, which is a function
				1138	defined in an implementation of the Instruction Selector (such as
				1139	``SparcISelDAGToDAG.cpp``).
				1140
				1141	In ``lib/Target/TargetSelectionDAG.td``, the DAG operator for store is defined
				1142	below:
				1143
				1144	.. code-block:: llvm
				1145
				1146	def store : PatFrag<(ops node:$val, node:$ptr),
				1147	(st node:$val, node:$ptr), [{
				1148	if (StoreSDNode *ST = dyn_cast<StoreSDNode>(N))
				1149	return !ST->isTruncatingStore() &&
				1150	ST->getAddressingMode() == ISD::UNINDEXED;
				1151	return false;
				1152	}]>;
				1153
				1154	``XXXInstrInfo.td`` also generates (in ``XXXGenDAGISel.inc``) the
				1155	``SelectCode`` method that is used to call the appropriate processing method
				1156	for an instruction. In this example, ``SelectCode`` calls ``Select_ISD_STORE``
				1157	for the ``ISD::STORE`` opcode.
				1158
				1159	.. code-block:: c++
				1160
				1161	SDNode *SelectCode(SDValue N) {
				1162	...
				1163	MVT::ValueType NVT = N.getNode()->getValueType(0);
				1164	switch (N.getOpcode()) {
				1165	case ISD::STORE: {
				1166	switch (NVT) {
				1167	default:
				1168	return Select_ISD_STORE(N);
				1169	break;
				1170	}
				1171	break;
				1172	}
				1173	...
				1174
				1175	The pattern for ``STrr`` is matched, so elsewhere in ``XXXGenDAGISel.inc``,
				1176	code for ``STrr`` is created for ``Select_ISD_STORE``. The ``Emit_22`` method
				1177	is also generated in ``XXXGenDAGISel.inc`` to complete the processing of this
				1178	instruction.
				1179
				1180	.. code-block:: c++
				1181
				1182	SDNode *Select_ISD_STORE(const SDValue &N) {
				1183	SDValue Chain = N.getOperand(0);
				1184	if (Predicate_store(N.getNode())) {
				1185	SDValue N1 = N.getOperand(1);
				1186	SDValue N2 = N.getOperand(2);
				1187	SDValue CPTmp0;
				1188	SDValue CPTmp1;
				1189
				1190	// Pattern: (st:void IntRegs:i32:$src,
				1191	// ADDRrr:i32:$addr)<<P:Predicate_store>>
				1192	// Emits: (STrr:void ADDRrr:i32:$addr, IntRegs:i32:$src)
				1193	// Pattern complexity = 13 cost = 1 size = 0
				1194	if (SelectADDRrr(N, N2, CPTmp0, CPTmp1) &&
				1195	N1.getNode()->getValueType(0) == MVT::i32 &&
				1196	N2.getNode()->getValueType(0) == MVT::i32) {
				1197	return Emit_22(N, SP::STrr, CPTmp0, CPTmp1);
				1198	}
				1199	...
				1200
				1201	The SelectionDAG Legalize Phase
				1202	-------------------------------
				1203
				1204	The Legalize phase converts a DAG to use types and operations that are natively
				1205	supported by the target. For natively unsupported types and operations, you
				1206	need to add code to the target-specific ``XXXTargetLowering`` implementation to
				1207	convert unsupported types and operations to supported ones.
				1208
				1209	In the constructor for the ``XXXTargetLowering`` class, first use the
				1210	``addRegisterClass`` method to specify which types are supported and which
				1211	register classes are associated with them. The code for the register classes
				1212	is generated by TableGen from ``XXXRegisterInfo.td`` and placed in
				1213	``XXXGenRegisterInfo.h.inc``. For example, the implementation of the
				1214	constructor for the ``SparcTargetLowering`` class (in ``SparcISelLowering.cpp``)
				1215	starts with the following code:
				1216
				1217	.. code-block:: c++
				1218
				1219	addRegisterClass(MVT::i32, SP::IntRegsRegisterClass);
				1220	addRegisterClass(MVT::f32, SP::FPRegsRegisterClass);
				1221	addRegisterClass(MVT::f64, SP::DFPRegsRegisterClass);
				1222
				1223	You should examine the node types in the ``ISD`` namespace
				1224	(``include/llvm/CodeGen/SelectionDAGNodes.h``) and determine which operations
				1225	the target natively supports. For operations that do not have native
				1226	support, add a callback to the constructor for the ``XXXTargetLowering`` class,
				1227	so the instruction selection process knows what to do. The ``TargetLowering``
				1228	class callback methods (declared in ``llvm/Target/TargetLowering.h``) are:
				1229
				1230	* ``setOperationAction`` --- General operation.
				1231	* ``setLoadExtAction`` --- Load with extension.
				1232	* ``setTruncStoreAction`` --- Truncating store.
				1233	* ``setIndexedLoadAction`` --- Indexed load.
				1234	* ``setIndexedStoreAction`` --- Indexed store.
				1235	* ``setConvertAction`` --- Type conversion.
				1236	* ``setCondCodeAction`` --- Support for a given condition code.
				1237
				1238	Note: on older releases, ``setLoadXAction`` is used instead of
				1239	``setLoadExtAction``. Also, on older releases, ``setCondCodeAction`` may not
				1240	be supported. Examine your release to see what methods are specifically
				1241	supported.
				1242
				1243	These callbacks specify whether an operation works with a specified type (or
				1244	types). In all cases, the third parameter is a
				1245	``LegalizeAction`` type enum value: ``Promote``, ``Expand``, ``Custom``, or
				1246	``Legal``. ``SparcISelLowering.cpp`` contains examples of all four
				1247	``LegalizeAction`` values.
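
One way to picture what these callbacks record is a table keyed by operation and value type, in which any pair that was never set is treated as ``Legal``. The following is a minimal sketch of that idea, not LLVM's data structure: ``Opcode``, ``Type``, and ``ActionTable`` are hypothetical stand-ins for the ``ISD`` opcodes, ``MVT`` types, and ``TargetLowering`` bookkeeping.

.. code-block:: c++

   #include <map>
   #include <utility>

   // Hypothetical stand-ins for the enums in LLVM's ISD and MVT namespaces.
   enum class Opcode { FSIN, FCOS, CTPOP, FP_TO_SINT };
   enum class Type { i1, i32, f32, f64 };
   enum class LegalizeAction { Legal, Promote, Expand, Custom };

   class ActionTable {
     std::map<std::pair<Opcode, Type>, LegalizeAction> Actions;

   public:
     // Records how the legalizer should treat Op on values of type VT.
     void setOperationAction(Opcode Op, Type VT, LegalizeAction A) {
       Actions[std::make_pair(Op, VT)] = A;
     }

     // Pairs that were never set fall back to Legal, the default condition.
     LegalizeAction getOperationAction(Opcode Op, Type VT) const {
       auto It = Actions.find(std::make_pair(Op, VT));
       return It == Actions.end() ? LegalizeAction::Legal : It->second;
     }
   };

In this model a later call for the same operation and type overwrites the earlier entry, which is how a target can set ``Expand`` first and then switch to ``Legal`` for a subtarget that supports the operation.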
				1248
				1249	Promote
				1250	^^^^^^^
				1251
				1252	For an operation without native support for a given type, the specified type
				1253	may be promoted to a larger type that is supported. For example, SPARC does
				1254	not support a sign-extending load for Boolean values (``i1`` type), so in
				1255	``SparcISelLowering.cpp`` the third parameter below, ``Promote``, changes
				1256	``i1`` type values to a larger type before loading.
				1257
				1258	.. code-block:: c++
				1259
				1260	setLoadExtAction(ISD::SEXTLOAD, MVT::i1, Promote);
				1261
				1262	Expand
				1263	^^^^^^
				1264
				1265	For a type without native support, a value may need to be broken down further,
				1266	rather than promoted. For an operation without native support, a combination
				1267	of other operations may be used to similar effect. In SPARC, the
				1268	floating-point sine and cosine trig operations are supported by expansion to
				1269	other operations, as indicated by the third parameter, ``Expand``, to
				1270	``setOperationAction``:
				1271
				1272	.. code-block:: c++
				1273
				1274	setOperationAction(ISD::FSIN, MVT::f32, Expand);
				1275	setOperationAction(ISD::FCOS, MVT::f32, Expand);
				1276
				1277	Custom
				1278	^^^^^^
				1279
				1280	For some operations, simple type promotion or operation expansion may be
				1281	insufficient. In some cases, a special intrinsic function must be implemented.
				1282
				1283	For example, a constant value may require special treatment, or an operation
				1284	may require spilling and restoring registers in the stack and working with
				1285	register allocators.
				1286
				1287	As seen in ``SparcISelLowering.cpp`` code below, to perform a type conversion
				1288	from a floating-point value to a signed integer, first
				1289	``setOperationAction`` should be called with ``Custom`` as the third parameter:
				1290
				1291	.. code-block:: c++
				1292
				1293	setOperationAction(ISD::FP_TO_SINT, MVT::i32, Custom);
				1294
				1295	In the ``LowerOperation`` method, for each ``Custom`` operation, a case
				1296	statement should be added to indicate what function to call. In the following
				1297	code, an ``FP_TO_SINT`` opcode will call the ``LowerFP_TO_SINT`` method:
				1298
.. code-block:: c++

  SDValue SparcTargetLowering::LowerOperation(SDValue Op, SelectionDAG &DAG) {
    switch (Op.getOpcode()) {
    case ISD::FP_TO_SINT: return LowerFP_TO_SINT(Op, DAG);
    ...
    }
  }
				1307
				1308	Finally, the ``LowerFP_TO_SINT`` method is implemented, using an FP register to
				1309	convert the floating-point value to an integer.
				1310
.. code-block:: c++

  static SDValue LowerFP_TO_SINT(SDValue Op, SelectionDAG &DAG) {
    assert(Op.getValueType() == MVT::i32);
    Op = DAG.getNode(SPISD::FTOI, MVT::f32, Op.getOperand(0));
    return DAG.getNode(ISD::BITCAST, MVT::i32, Op);
  }
				1318
				1319	Legal
				1320	^^^^^
				1321
				1322	The ``Legal`` ``LegalizeAction`` enum value simply indicates that an operation
				1323	is natively supported. ``Legal`` represents the default condition, so it
				1324	is rarely used. In ``SparcISelLowering.cpp``, the action for ``CTPOP`` (an
				1325	operation to count the bits set in an integer) is natively supported only for
				1326	SPARC v9. The following code enables the ``Expand`` conversion technique for
				1327	non-v9 SPARC implementations.
				1328
.. code-block:: c++

  setOperationAction(ISD::CTPOP, MVT::i32, Expand);
  ...
  if (TM.getSubtarget<SparcSubtarget>().isV9())
    setOperationAction(ISD::CTPOP, MVT::i32, Legal);
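
The way the four actions interact can be sketched with a small stand-alone
model. Everything in it (``ToyTargetLowering``, the trimmed-down enums, and
``makeToySparcLowering``) is hypothetical and is not LLVM's API. It only
illustrates two points from the text: an unconfigured operation defaults to
``Legal``, and a later ``setOperationAction`` call for the same operation and
type overrides an earlier one.

.. code-block:: c++

  #include <map>
  #include <utility>

  // Hypothetical stand-ins for LLVM's opcode, value-type, and action enums.
  enum class Opcode { FSIN, FCOS, FP_TO_SINT, CTPOP, ADD };
  enum class ValueType { i1, i32, f32 };
  enum class LegalizeAction { Legal, Promote, Expand, Custom };

  class ToyTargetLowering {
    std::map<std::pair<Opcode, ValueType>, LegalizeAction> Actions;

  public:
    // Record how (Op, VT) should be treated; a later call for the same pair
    // overrides an earlier one.
    void setOperationAction(Opcode Op, ValueType VT, LegalizeAction A) {
      Actions[{Op, VT}] = A;
    }

    // Pairs that were never configured fall back to Legal, the default.
    LegalizeAction getOperationAction(Opcode Op, ValueType VT) const {
      auto It = Actions.find({Op, VT});
      return It == Actions.end() ? LegalizeAction::Legal : It->second;
    }
  };

  // Mirrors the CTPOP snippet: expand everywhere, then mark it Legal on V9.
  inline ToyTargetLowering makeToySparcLowering(bool IsV9) {
    ToyTargetLowering TL;
    TL.setOperationAction(Opcode::FSIN, ValueType::f32, LegalizeAction::Expand);
    TL.setOperationAction(Opcode::FCOS, ValueType::f32, LegalizeAction::Expand);
    TL.setOperationAction(Opcode::FP_TO_SINT, ValueType::i32,
                          LegalizeAction::Custom);
    TL.setOperationAction(Opcode::CTPOP, ValueType::i32, LegalizeAction::Expand);
    if (IsV9)
      TL.setOperationAction(Opcode::CTPOP, ValueType::i32,
                            LegalizeAction::Legal);
    return TL;
  }

With this model, ``makeToySparcLowering(false)`` reports ``Expand`` for
``CTPOP`` on ``i32``, ``makeToySparcLowering(true)`` reports ``Legal``, and an
operation such as ``ADD`` that was never mentioned reports ``Legal`` in both.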
				1335
				1336	Calling Conventions
				1337	-------------------
				1338
				1339	To support target-specific calling conventions, ``XXXGenCallingConv.td`` uses
				1340	interfaces (such as ``CCIfType`` and ``CCAssignToReg``) that are defined in
				1341	``lib/Target/TargetCallingConv.td``. TableGen can take the target descriptor
				1342	file ``XXXGenCallingConv.td`` and generate the header file
				1343	``XXXGenCallingConv.inc``, which is typically included in
				1344	``XXXISelLowering.cpp``. You can use the interfaces in
				1345	``TargetCallingConv.td`` to specify:
				1346
				1347	* The order of parameter allocation.
				1348
				1349	* Where parameters and return values are placed (that is, on the stack or in
				1350	registers).
				1351
				1352	* Which registers may be used.
				1353
				1354	* Whether the caller or callee unwinds the stack.
				1355
				1356	The following example demonstrates the use of the ``CCIfType`` and
				1357	``CCAssignToReg`` interfaces. If the ``CCIfType`` predicate is true (that is,
				1358	if the current argument is of type ``f32`` or ``f64``), then the action is
				1359	performed. In this case, the ``CCAssignToReg`` action assigns the argument
				1360	value to the first available register: either ``R0`` or ``R1``.
				1361
				1362	.. code-block:: llvm
				1363
				1364	CCIfType<[f32,f64], CCAssignToReg<[R0, R1]>>
				1365
``SparcCallingConv.td`` contains definitions for a target-specific return-value
calling convention (``RetCC_Sparc32``) and a basic 32-bit C calling convention
(``CC_Sparc32``). The definition of ``RetCC_Sparc32`` (shown below) indicates
which registers are used for specified scalar return types. A single-precision
float is returned in register ``F0``, and a double-precision float is returned
in register ``D0``. A 32-bit integer is returned in register ``I0`` or ``I1``.
				1372
.. code-block:: llvm

  def RetCC_Sparc32 : CallingConv<[
    CCIfType<[i32], CCAssignToReg<[I0, I1]>>,
    CCIfType<[f32], CCAssignToReg<[F0]>>,
    CCIfType<[f64], CCAssignToReg<[D0]>>
  ]>;
				1380
				1381	The definition of ``CC_Sparc32`` in ``SparcCallingConv.td`` introduces
				1382	``CCAssignToStack``, which assigns the value to a stack slot with the specified
				1383	size and alignment. In the example below, the first parameter, 4, indicates
				1384	the size of the slot, and the second parameter, also 4, indicates the stack
				1385	alignment along 4-byte units. (Special cases: if size is zero, then the ABI
				1386	size is used; if alignment is zero, then the ABI alignment is used.)
				1387
.. code-block:: llvm

  def CC_Sparc32 : CallingConv<[
    // All arguments get passed in integer registers if there is space.
    CCIfType<[i32, f32, f64], CCAssignToReg<[I0, I1, I2, I3, I4, I5]>>,
    CCAssignToStack<4, 4>
  ]>;
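
The effect of ``CC_Sparc32`` on a run of arguments can be sketched as a small
allocator. The names ``ToyCCState``, ``ArgLocation``, and
``allocateToySparc32`` are hypothetical and are not what TableGen generates.
The sketch also assumes that every argument occupies a single register or a
single 4-byte slot, and it leaves out the "zero means ABI default" special
cases.

.. code-block:: c++

  #include <cstddef>
  #include <string>
  #include <utility>
  #include <vector>

  // Where one argument ended up: in a named register or at a stack offset.
  struct ArgLocation {
    bool InReg;
    std::string Reg;      // meaningful when InReg is true
    unsigned StackOffset; // meaningful when InReg is false
  };

  // Hypothetical allocator state: which registers are taken and how much
  // stack space has been handed out so far.
  class ToyCCState {
    std::vector<std::string> Regs;
    std::size_t NextReg = 0;
    unsigned StackSize = 0;

  public:
    explicit ToyCCState(std::vector<std::string> RegList)
        : Regs(std::move(RegList)) {}

    // CCAssignToReg-like step: take the first free register, or fail.
    bool assignToReg(ArgLocation &Loc) {
      if (NextReg == Regs.size())
        return false;
      Loc = {true, Regs[NextReg++], 0};
      return true;
    }

    // CCAssignToStack-like step: round the running offset up to Align, then
    // reserve Size bytes. Align must be a nonzero power of two.
    void assignToStack(unsigned Size, unsigned Align, ArgLocation &Loc) {
      unsigned Offset = (StackSize + Align - 1) & ~(Align - 1);
      Loc = {false, "", Offset};
      StackSize = Offset + Size;
    }
  };

  // Mirrors CC_Sparc32: registers I0 to I5 first, then 4-byte stack slots.
  inline std::vector<ArgLocation> allocateToySparc32(std::size_t NumArgs) {
    ToyCCState State({"I0", "I1", "I2", "I3", "I4", "I5"});
    std::vector<ArgLocation> Locs(NumArgs);
    for (ArgLocation &Loc : Locs)
      if (!State.assignToReg(Loc))
        State.assignToStack(4, 4, Loc);
    return Locs;
  }

For eight arguments, the first six land in ``I0`` through ``I5`` and the last
two land at stack offsets 0 and 4.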
				1395
``CCDelegateTo`` is another commonly used interface, which hands the current
value to a specified sub-calling convention. In the following example (in
``X86CallingConv.td``), the definition of ``RetCC_X86_32_C`` ends with
``CCDelegateTo``. The rules are tried in order, so if the current value is not
assigned to the register ``ST0`` or ``ST1`` by the preceding rules,
``RetCC_X86Common`` is invoked.
				1402
.. code-block:: llvm

  def RetCC_X86_32_C : CallingConv<[
    CCIfType<[f32], CCAssignToReg<[ST0, ST1]>>,
    CCIfType<[f64], CCAssignToReg<[ST0, ST1]>>,
    CCDelegateTo<RetCC_X86Common>
  ]>;
				1410
				1411	``CCIfCC`` is an interface that attempts to match the given name to the current
				1412	calling convention. If the name identifies the current calling convention,
				1413	then a specified action is invoked. In the following example (in
				1414	``X86CallingConv.td``), if the ``Fast`` calling convention is in use, then
				1415	``RetCC_X86_32_Fast`` is invoked. If the ``SSECall`` calling convention is in
				1416	use, then ``RetCC_X86_32_SSE`` is invoked.
				1417
.. code-block:: llvm

  def RetCC_X86_32 : CallingConv<[
    CCIfCC<"CallingConv::Fast", CCDelegateTo<RetCC_X86_32_Fast>>,
    CCIfCC<"CallingConv::X86_SSECall", CCDelegateTo<RetCC_X86_32_SSE>>,
    CCDelegateTo<RetCC_X86_32_C>
  ]>;
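
The ordering behaviour of ``CCIfType``, ``CCIfCC``, and ``CCDelegateTo`` can
be modelled as a list of rules that are tried in turn, where the first rule
that assigns a location wins. The following is a minimal sketch under that
assumption. ``ToyValue``, ``ToyRule``, and the helper functions are
hypothetical, and the register names are placeholders rather than the actual
X86 rules.

.. code-block:: c++

  #include <functional>
  #include <string>
  #include <vector>

  // The value being assigned, plus the location chosen by the winning rule.
  struct ToyValue {
    std::string Type;        // for example "f32" or "i32"
    std::string CallingConv; // for example "Fast" or "C"
    std::string Location;    // filled in by the rule that handles the value
  };

  // A rule returns true once it has assigned a location to the value.
  using ToyRule = std::function<bool(ToyValue &)>;
  using ToyConv = std::vector<ToyRule>;

  // Try each rule in order; stop at the first one that handles the value.
  inline bool runConv(const ToyConv &Conv, ToyValue &V) {
    for (const ToyRule &Rule : Conv)
      if (Rule(V))
        return true;
    return false;
  }

  // CCAssignToReg-like action with a single, always-available location.
  inline ToyRule assignTo(std::string Loc) {
    return [Loc](ToyValue &V) { V.Location = Loc; return true; };
  }

  // CCIfType-like predicate: run Action only for values of the given type.
  inline ToyRule ifType(std::string Type, ToyRule Action) {
    return [Type, Action](ToyValue &V) { return V.Type == Type && Action(V); };
  }

  // CCIfCC-like predicate: run Action only under the named convention.
  inline ToyRule ifCC(std::string CC, ToyRule Action) {
    return [CC, Action](ToyValue &V) {
      return V.CallingConv == CC && Action(V);
    };
  }

  // CCDelegateTo-like action: hand the value to another rule list.
  inline ToyRule delegateTo(ToyConv Conv) {
    return [Conv](ToyValue &V) { return runConv(Conv, V); };
  }

  // Shaped like RetCC_X86_32; the locations are illustrative only.
  inline ToyConv makeToyRetConv() {
    ToyConv Common = {assignTo("EAX")};
    ToyConv Fast = {ifType("f32", assignTo("XMM0")), delegateTo(Common)};
    ToyConv C = {ifType("f32", assignTo("ST0")), delegateTo(Common)};
    return {ifCC("Fast", delegateTo(Fast)), delegateTo(C)};
  }

In this model an ``f32`` under the ``Fast`` convention is placed in ``XMM0``,
an ``f32`` under any other convention is placed in ``ST0``, and an ``i32``
falls through both type checks and is handled by the common rule list.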
				1425
				1426	Other calling convention interfaces include:
				1427
				1428	* ``CCIf <predicate, action>`` --- If the predicate matches, apply the action.
				1429
				1430	* ``CCIfInReg <action>`` --- If the argument is marked with the "``inreg``"
				1431	attribute, then apply the action.
				1432
				1433	* ``CCIfNest <action>`` --- If the argument is marked with the "``nest``"
				1434	attribute, then apply the action.
				1435
				1436	* ``CCIfNotVarArg <action>`` --- If the current function does not take a
				1437	variable number of arguments, apply the action.
				1438
				1439	* ``CCAssignToRegWithShadow <registerList, shadowList>`` --- similar to
				1440	``CCAssignToReg``, but with a shadow list of registers.
				1441
				1442	* ``CCPassByVal <size, align>`` --- Assign value to a stack slot with the
				1443	minimum specified size and alignment.
				1444
				1445	* ``CCPromoteToType <type>`` --- Promote the current value to the specified
				1446	type.
				1447
				1448	* ``CallingConv <[actions]>`` --- Define each calling convention that is
				1449	supported.
				1450
				1451	Assembly Printer
				1452	================
				1453
During the code emission stage, the code generator may utilize an LLVM pass to
produce assembly output. To do this, implement a printer that converts LLVM IR
to a GAS-format assembly language for your target machine, using the following
steps:
				1458
				1459	* Define all the assembly strings for your target, adding them to the
				1460	instructions defined in the ``XXXInstrInfo.td`` file. (See
				1461	:ref:`instruction-set`.) TableGen will produce an output file
				1462	(``XXXGenAsmWriter.inc``) with an implementation of the ``printInstruction``
				1463	method for the ``XXXAsmPrinter`` class.
				1464
				1465	* Write ``XXXTargetAsmInfo.h``, which contains the bare-bones declaration of
				1466	the ``XXXTargetAsmInfo`` class (a subclass of ``TargetAsmInfo``).
				1467
				1468	* Write ``XXXTargetAsmInfo.cpp``, which contains target-specific values for
				1469	``TargetAsmInfo`` properties and sometimes new implementations for methods.
				1470
				1471	* Write ``XXXAsmPrinter.cpp``, which implements the ``AsmPrinter`` class that
				1472	performs the LLVM-to-assembly conversion.
				1473
				1474	The code in ``XXXTargetAsmInfo.h`` is usually a trivial declaration of the
				1475	``XXXTargetAsmInfo`` class for use in ``XXXTargetAsmInfo.cpp``. Similarly,
				1476	``XXXTargetAsmInfo.cpp`` usually has a few declarations of ``XXXTargetAsmInfo``
				1477	replacement values that override the default values in ``TargetAsmInfo.cpp``.
				1478	For example in ``SparcTargetAsmInfo.cpp``:
				1479
.. code-block:: c++

  SparcTargetAsmInfo::SparcTargetAsmInfo(const SparcTargetMachine &TM) {
    Data16bitsDirective = "\t.half\t";
    Data32bitsDirective = "\t.word\t";
    Data64bitsDirective = 0;  // .xword is only supported by V9.
    ZeroDirective = "\t.skip\t";
    CommentString = "!";
    ConstantPoolSection = "\t.section \".rodata\",#alloc\n";
  }
				1490
The X86 assembly printer implementation (``X86TargetAsmInfo``) is an example
where the target-specific ``TargetAsmInfo`` class uses an overridden method:
``ExpandInlineAsm``.
				1494
				1495	A target-specific implementation of ``AsmPrinter`` is written in
				1496	``XXXAsmPrinter.cpp``, which implements the ``AsmPrinter`` class that converts
				1497	the LLVM to printable assembly. The implementation must include the following
				1498	headers that have declarations for the ``AsmPrinter`` and
				1499	``MachineFunctionPass`` classes. The ``MachineFunctionPass`` is a subclass of
				1500	``FunctionPass``.
				1501
				1502	.. code-block:: c++
				1503
				1504	#include "llvm/CodeGen/AsmPrinter.h"
				1505	#include "llvm/CodeGen/MachineFunctionPass.h"
				1506
				1507	As a ``FunctionPass``, ``AsmPrinter`` first calls ``doInitialization`` to set
				1508	up the ``AsmPrinter``. In ``SparcAsmPrinter``, a ``Mangler`` object is
				1509	instantiated to process variable names.
				1510
				1511	In ``XXXAsmPrinter.cpp``, the ``runOnMachineFunction`` method (declared in
				1512	``MachineFunctionPass``) must be implemented for ``XXXAsmPrinter``. In
				1513	``MachineFunctionPass``, the ``runOnFunction`` method invokes
				1514	``runOnMachineFunction``. Target-specific implementations of
				1515	``runOnMachineFunction`` differ, but generally do the following to process each
				1516	machine function:
				1517
				1518	* Call ``SetupMachineFunction`` to perform initialization.
				1519
				1520	* Call ``EmitConstantPool`` to print out (to the output stream) constants which
				1521	have been spilled to memory.
				1522
				1523	* Call ``EmitJumpTableInfo`` to print out jump tables used by the current
				1524	function.
				1525
				1526	* Print out the label for the current function.
				1527
* Print out the code for the function, including basic block labels and the
  assembly for the instruction (using ``printInstruction``).
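
The last two steps above can be sketched with a toy printer that writes to a
string. ``ToyInstr``, ``ToyBlock``, ``ToyFunction``, and ``ToyAsmPrinter`` are
hypothetical stand-ins, not LLVM classes. The sketch assumes that each
instruction already carries its assembly text, and it omits constant pools and
jump tables.

.. code-block:: c++

  #include <sstream>
  #include <string>
  #include <vector>

  // Hypothetical, heavily simplified stand-ins for the machine-level IR.
  struct ToyInstr {
    std::string Text;
  };
  struct ToyBlock {
    std::string Label;
    std::vector<ToyInstr> Instrs;
  };
  struct ToyFunction {
    std::string Name;
    std::vector<ToyBlock> Blocks;
  };

  class ToyAsmPrinter {
    std::ostringstream OS;

    // Stand-in for the TableGen-generated printInstruction method.
    void printInstruction(const ToyInstr &MI) {
      OS << '\t' << MI.Text << '\n';
    }

  public:
    // Print the function label, then each block label and its instructions.
    void runOnMachineFunction(const ToyFunction &MF) {
      OS << "\t.globl\t" << MF.Name << '\n';
      OS << MF.Name << ":\n";
      for (const ToyBlock &MBB : MF.Blocks) {
        OS << MBB.Label << ":\n";
        for (const ToyInstr &MI : MBB.Instrs)
          printInstruction(MI);
      }
    }

    // Everything printed so far.
    std::string str() const { return OS.str(); }
  };

A real printer writes to the output stream owned by ``AsmPrinter`` rather than
to a private string, but the traversal order is the same.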
				1530
				1531	The ``XXXAsmPrinter`` implementation must also include the code generated by
				1532	TableGen that is output in the ``XXXGenAsmWriter.inc`` file. The code in
				1533	``XXXGenAsmWriter.inc`` contains an implementation of the ``printInstruction``
				1534	method that may call these methods:
				1535
				1536	* ``printOperand``
				1537	* ``printMemOperand``
				1538	* ``printCCOperand`` (for conditional statements)
				1539	* ``printDataDirective``
				1540	* ``printDeclare``
				1541	* ``printImplicitDef``
				1542	* ``printInlineAsm``
				1543
				1544	The implementations of ``printDeclare``, ``printImplicitDef``,
				1545	``printInlineAsm``, and ``printLabel`` in ``AsmPrinter.cpp`` are generally
				1546	adequate for printing assembly and do not need to be overridden.
				1547
				1548	The ``printOperand`` method is implemented with a long ``switch``/``case``
				1549	statement for the type of operand: register, immediate, basic block, external
				1550	symbol, global address, constant pool index, or jump table index. For an
				1551	instruction with a memory address operand, the ``printMemOperand`` method
				1552	should be implemented to generate the proper output. Similarly,
				1553	``printCCOperand`` should be used to print a conditional operand.
				1554
				1555	``doFinalization`` should be overridden in ``XXXAsmPrinter``, and it should be
				1556	called to shut down the assembly printer. During ``doFinalization``, global
				1557	variables and constants are printed to output.
				1558
				1559	Subtarget Support
				1560	=================
				1561
				1562	Subtarget support is used to inform the code generation process of instruction
				1563	set variations for a given chip set. For example, the LLVM SPARC
				1564	implementation provided covers three major versions of the SPARC microprocessor
				1565	architecture: Version 8 (V8, which is a 32-bit architecture), Version 9 (V9, a
				1566	64-bit architecture), and the UltraSPARC architecture. V8 has 16
				1567	double-precision floating-point registers that are also usable as either 32
				1568	single-precision or 8 quad-precision registers. V8 is also purely big-endian.
				1569	V9 has 32 double-precision floating-point registers that are also usable as 16
				1570	quad-precision registers, but cannot be used as single-precision registers.
				1571	The UltraSPARC architecture combines V9 with UltraSPARC Visual Instruction Set
				1572	extensions.
				1573
				1574	If subtarget support is needed, you should implement a target-specific
				1575	``XXXSubtarget`` class for your architecture. This class should process the
				1576	command-line options ``-mcpu=`` and ``-mattr=``.
				1577
				1578	TableGen uses definitions in the ``Target.td`` and ``Sparc.td`` files to
				1579	generate code in ``SparcGenSubtarget.inc``. In ``Target.td``, shown below, the
				1580	``SubtargetFeature`` interface is defined. The first 4 string parameters of
				1581	the ``SubtargetFeature`` interface are a feature name, an attribute set by the
				1582	feature, the value of the attribute, and a description of the feature. (The
				1583	fifth parameter is a list of features whose presence is implied, and its
				1584	default value is an empty array.)
				1585
.. code-block:: llvm

  class SubtargetFeature<string n, string a, string v, string d,
                         list<SubtargetFeature> i = []> {
    string Name = n;
    string Attribute = a;
    string Value = v;
    string Desc = d;
    list<SubtargetFeature> Implies = i;
  }
				1596
				1597	In the ``Sparc.td`` file, the ``SubtargetFeature`` is used to define the
				1598	following features.
				1599
.. code-block:: llvm

  def FeatureV9 : SubtargetFeature<"v9", "IsV9", "true",
                       "Enable SPARC-V9 instructions">;
  def FeatureV8Deprecated : SubtargetFeature<"deprecated-v8",
                       "V8DeprecatedInsts", "true",
                       "Enable deprecated V8 instructions in V9 mode">;
  def FeatureVIS : SubtargetFeature<"vis", "IsVIS", "true",
                       "Enable UltraSPARC Visual Instruction Set extensions">;
				1609
				1610	Elsewhere in ``Sparc.td``, the ``Proc`` class is defined and then is used to
				1611	define particular SPARC processor subtypes that may have the previously
				1612	described features.
				1613
.. code-block:: llvm

  class Proc<string Name, list<SubtargetFeature> Features>
    : Processor<Name, NoItineraries, Features>;

  def : Proc<"generic", []>;
  def : Proc<"v8", []>;
  def : Proc<"supersparc", []>;
  def : Proc<"sparclite", []>;
  def : Proc<"f934", []>;
  def : Proc<"hypersparc", []>;
  def : Proc<"sparclite86x", []>;
  def : Proc<"sparclet", []>;
  def : Proc<"tsc701", []>;
  def : Proc<"v9", [FeatureV9]>;
  def : Proc<"ultrasparc", [FeatureV9, FeatureV8Deprecated]>;
  def : Proc<"ultrasparc3", [FeatureV9, FeatureV8Deprecated]>;
  def : Proc<"ultrasparc3-vis", [FeatureV9, FeatureV8Deprecated, FeatureVIS]>;
				1632
From the ``Target.td`` and ``Sparc.td`` files, the resulting
``SparcGenSubtarget.inc`` specifies enum values to identify the features,
arrays of constants to represent the CPU features and CPU subtypes, and the
``ParseSubtargetFeatures`` method that parses the features string that sets
specified subtarget options. The generated ``SparcGenSubtarget.inc`` file
should be included in ``SparcSubtarget.cpp``. The target-specific
implementation of the ``XXXSubtarget`` constructor should follow this
pseudocode:
				1640
.. code-block:: c++

  XXXSubtarget::XXXSubtarget(const Module &M, const std::string &FS) {
    // Set the default features
    // Determine default and user specified characteristics of the CPU
    // Call ParseSubtargetFeatures(FS, CPU) to parse the features string
    // Perform any additional operations
  }
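
As a rough illustration of that pseudocode, the sketch below applies CPU
defaults and then a user-supplied features string. ``ToySparcSubtarget`` and
its hand-written ``parseFeatures`` method are hypothetical; in a real backend
the parsing is done by the generated ``ParseSubtargetFeatures``. The
``+name``/``-name`` comma-separated syntax is assumed here to match what
``-mattr=`` accepts, and only two CPU names are recognized.

.. code-block:: c++

  #include <sstream>
  #include <string>

  // Hypothetical subtarget with the three SPARC feature flags from Sparc.td.
  struct ToySparcSubtarget {
    bool IsV9 = false;
    bool V8DeprecatedInsts = false;
    bool IsVIS = false;

    // Apply a comma-separated list such as "+v9,-vis". Unknown names and
    // entries without a leading '+' or '-' are ignored in this sketch.
    void parseFeatures(const std::string &FS) {
      std::istringstream In(FS);
      std::string Item;
      while (std::getline(In, Item, ',')) {
        if (Item.size() < 2 || (Item[0] != '+' && Item[0] != '-'))
          continue;
        bool Enable = Item[0] == '+';
        std::string Name = Item.substr(1);
        if (Name == "v9")
          IsV9 = Enable;
        else if (Name == "deprecated-v8")
          V8DeprecatedInsts = Enable;
        else if (Name == "vis")
          IsVIS = Enable;
      }
    }

    // Follows the pseudocode: CPU defaults first, then the features string.
    ToySparcSubtarget(const std::string &CPU, const std::string &FS) {
      if (CPU == "v9" || CPU == "ultrasparc")
        IsV9 = true;
      if (CPU == "ultrasparc")
        V8DeprecatedInsts = true;
      parseFeatures(FS);
    }
  };

Because the features string is applied after the CPU defaults, a request such
as ``-v9`` can switch off a feature that the selected CPU would otherwise
enable.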
				1649
				1650	JIT Support
				1651	===========
				1652
				1653	The implementation of a target machine optionally includes a Just-In-Time (JIT)
				1654	code generator that emits machine code and auxiliary structures as binary
				1655	output that can be written directly to memory. To do this, implement JIT code
				1656	generation by performing the following steps:
				1657
				1658	* Write an ``XXXCodeEmitter.cpp`` file that contains a machine function pass
				1659	that transforms target-machine instructions into relocatable machine
				1660	code.
				1661
				1662	* Write an ``XXXJITInfo.cpp`` file that implements the JIT interfaces for
				1663	target-specific code-generation activities, such as emitting machine code and
				1664	stubs.
				1665
				1666	* Modify ``XXXTargetMachine`` so that it provides a ``TargetJITInfo`` object
				1667	through its ``getJITInfo`` method.
				1668
				1669	There are several different approaches to writing the JIT support code. For
				1670	instance, TableGen and target descriptor files may be used for creating a JIT
				1671	code generator, but are not mandatory. For the Alpha and PowerPC target
				1672	machines, TableGen is used to generate ``XXXGenCodeEmitter.inc``, which
				1673	contains the binary coding of machine instructions and the
				1674	``getBinaryCodeForInstr`` method to access those codes. Other JIT
				1675	implementations do not.
				1676
				1677	Both ``XXXJITInfo.cpp`` and ``XXXCodeEmitter.cpp`` must include the
				1678	``llvm/CodeGen/MachineCodeEmitter.h`` header file that defines the
				1679	``MachineCodeEmitter`` class containing code for several callback functions
				1680	that write data (in bytes, words, strings, etc.) to the output stream.
				1681
				1682	Machine Code Emitter
				1683	--------------------
				1684
In ``XXXCodeEmitter.cpp``, a target-specific version of the ``Emitter`` class
is implemented as a function pass (subclass of ``MachineFunctionPass``). The
target-specific implementation of ``runOnMachineFunction`` (invoked by
``runOnFunction`` in ``MachineFunctionPass``) iterates through each
``MachineBasicBlock`` and calls ``emitInstruction`` to process each instruction
and emit binary code. ``emitInstruction`` is largely implemented with case
statements on the instruction types defined in ``XXXInstrInfo.h``. For
example, in ``X86CodeEmitter.cpp``, the ``emitInstruction`` method is built
around the following ``switch``/``case`` statements:
				1694
.. code-block:: c++

  switch (Desc->TSFlags & X86::FormMask) {
  case X86II::Pseudo:  // for not yet implemented instructions
    ...                // or pseudo-instructions
    break;
  case X86II::RawFrm:  // for instructions with a fixed opcode value
    ...
    break;
  case X86II::AddRegFrm: // for instructions that have one register operand
    ...                  // added to their opcode
    break;
  case X86II::MRMDestReg:// for instructions that use the Mod/RM byte
    ...                  // to specify a destination (register)
    break;
  case X86II::MRMDestMem:// for instructions that use the Mod/RM byte
    ...                  // to specify a destination (memory)
    break;
  case X86II::MRMSrcReg: // for instructions that use the Mod/RM byte
    ...                  // to specify a source (register)
    break;
  case X86II::MRMSrcMem: // for instructions that use the Mod/RM byte
    ...                  // to specify a source (memory)
    break;
  case X86II::MRM0r: case X86II::MRM1r:  // for instructions that operate on
  case X86II::MRM2r: case X86II::MRM3r:  // a REGISTER r/m operand and
  case X86II::MRM4r: case X86II::MRM5r:  // use the Mod/RM byte and a field
  case X86II::MRM6r: case X86II::MRM7r:  // to hold extended opcode data
    ...
    break;
  case X86II::MRM0m: case X86II::MRM1m:  // for instructions that operate on
  case X86II::MRM2m: case X86II::MRM3m:  // a MEMORY r/m operand and
  case X86II::MRM4m: case X86II::MRM5m:  // use the Mod/RM byte and a field
  case X86II::MRM6m: case X86II::MRM7m:  // to hold extended opcode data
    ...
    break;
  case X86II::MRMInitReg: // for instructions whose source and
    ...                   // destination are the same register
    break;
  }
				1735
				1736	The implementations of these case statements often first emit the opcode and
				1737	then get the operand(s). Then depending upon the operand, helper methods may
				1738	be called to process the operand(s). For example, in ``X86CodeEmitter.cpp``,
				1739	for the ``X86II::AddRegFrm`` case, the first data emitted (by ``emitByte``) is
				1740	the opcode added to the register operand. Then an object representing the
				1741	machine operand, ``MO1``, is extracted. The helper methods such as
				1742	``isImmediate``, ``isGlobalAddress``, ``isExternalSymbol``,
				1743	``isConstantPoolIndex``, and ``isJumpTableIndex`` determine the operand type.
				1744	(``X86CodeEmitter.cpp`` also has private methods such as ``emitConstant``,
				1745	``emitGlobalAddress``, ``emitExternalSymbolAddress``, ``emitConstPoolAddress``,
				1746	and ``emitJumpTableAddress`` that emit the data into the output stream.)
				1747
.. code-block:: c++

  case X86II::AddRegFrm:
    MCE.emitByte(BaseOpcode + getX86RegNum(MI.getOperand(CurOp++).getReg()));

    if (CurOp != NumOps) {
      const MachineOperand &MO1 = MI.getOperand(CurOp++);
      unsigned Size = X86InstrInfo::sizeOfImm(Desc);
      if (MO1.isImmediate())
        emitConstant(MO1.getImm(), Size);
      else {
        unsigned rt = Is64BitMode ? X86::reloc_pcrel_word
          : (IsPIC ? X86::reloc_picrel_word : X86::reloc_absolute_word);
        if (Opcode == X86::MOV64ri)
          rt = X86::reloc_absolute_dword;  // FIXME: add X86II flag?
        if (MO1.isGlobalAddress()) {
          bool NeedStub = isa<Function>(MO1.getGlobal());
          bool isLazy = gvNeedsLazyPtr(MO1.getGlobal());
          emitGlobalAddress(MO1.getGlobal(), rt, MO1.getOffset(), 0,
                            NeedStub, isLazy);
        } else if (MO1.isExternalSymbol())
          emitExternalSymbolAddress(MO1.getSymbolName(), rt);
        else if (MO1.isConstantPoolIndex())
          emitConstPoolAddress(MO1.getIndex(), rt);
        else if (MO1.isJumpTableIndex())
          emitJumpTableAddress(MO1.getIndex(), rt);
      }
    }
    break;
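
The immediate-operand path of this case can be reproduced in isolation. The
``ToyCodeEmitter`` class below is a hypothetical byte sink, not the
``MachineCodeEmitter`` interface, and it assumes a little-endian immediate as
used on X86. The relocation paths for global addresses and symbols are left
out.

.. code-block:: c++

  #include <cstdint>
  #include <vector>

  // Hypothetical byte sink standing in for the machine code emitter.
  class ToyCodeEmitter {
    std::vector<std::uint8_t> Bytes;

  public:
    void emitByte(std::uint8_t B) { Bytes.push_back(B); }

    // Emit the low Size bytes of Val, least significant byte first.
    void emitConstant(std::uint64_t Val, unsigned Size) {
      for (unsigned I = 0; I != Size; ++I) {
        emitByte(static_cast<std::uint8_t>(Val & 0xFF));
        Val >>= 8;
      }
    }

    // AddRegFrm-style encoding: the register number is added to the base
    // opcode, and the immediate follows it.
    void emitAddRegFrm(std::uint8_t BaseOpcode, unsigned RegNum,
                       std::uint32_t Imm, unsigned ImmSize) {
      emitByte(static_cast<std::uint8_t>(BaseOpcode + RegNum));
      emitConstant(Imm, ImmSize);
    }

    // The bytes emitted so far.
    const std::vector<std::uint8_t> &bytes() const { return Bytes; }
  };

For instance, the 32-bit ``MOV`` of an immediate into a register has base
opcode ``0xB8``, and ``ECX`` is register number 1, so
``emitAddRegFrm(0xB8, 1, 0x12345678, 4)`` produces the bytes
``B9 78 56 34 12``.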
				1777
				1778	In the previous example, ``XXXCodeEmitter.cpp`` uses the variable ``rt``, which
				1779	is a ``RelocationType`` enum that may be used to relocate addresses (for
				1780	example, a global address with a PIC base offset). The ``RelocationType`` enum
				1781	for that target is defined in the short target-specific ``XXXRelocations.h``
				1782	file. The ``RelocationType`` is used by the ``relocate`` method defined in
				1783	``XXXJITInfo.cpp`` to rewrite addresses for referenced global symbols.
				1784
				1785	For example, ``X86Relocations.h`` specifies the following relocation types for
				1786	the X86 addresses. In all four cases, the relocated value is added to the
				1787	value already in memory. For ``reloc_pcrel_word`` and ``reloc_picrel_word``,
				1788	there is an additional initial adjustment.
				1789
.. code-block:: c++

  enum RelocationType {
    reloc_pcrel_word = 0,    // add reloc value after adjusting for the PC loc
    reloc_picrel_word = 1,   // add reloc value after adjusting for the PIC base
    reloc_absolute_word = 2, // absolute relocation; no additional adjustment
    reloc_absolute_dword = 3 // absolute relocation; no additional adjustment
  };
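
The "added to the value already in memory" behaviour can be sketched for two
of these cases. ``relocateWord`` and ``ToyRelocationType`` are hypothetical
and are not the ``relocate`` method from ``X86JITInfo.cpp``. In particular,
the sketch assumes that a PC-relative value is measured from the end of the
4-byte field being patched, and it takes the load address of the buffer as an
explicit parameter.

.. code-block:: c++

  #include <cstdint>
  #include <cstring>

  // Hypothetical subset of the relocation kinds described above.
  enum ToyRelocationType { toy_pcrel_word, toy_absolute_word };

  // Patch the 32-bit word at Buffer[Offset]. BufferAddr is the address the
  // buffer is assumed to be loaded at; Target is the resolved symbol address.
  inline void relocateWord(std::uint8_t *Buffer, std::uint32_t BufferAddr,
                           std::uint32_t Offset, ToyRelocationType Type,
                           std::uint32_t Target) {
    std::uint32_t Word;
    std::memcpy(&Word, Buffer + Offset, sizeof(Word));

    std::uint32_t Value = Target;
    if (Type == toy_pcrel_word) {
      // Assumption: the PC used for the adjustment is the address just past
      // the 4-byte field.
      Value -= BufferAddr + Offset + 4;
    }

    // In both cases the relocated value is added to what is already there.
    Word += Value;
    std::memcpy(Buffer + Offset, &Word, sizeof(Word));
  }

With a buffer loaded at ``0x1000``, a PC-relative relocation at offset 0
against target ``0x2000`` adds ``0x2000 - 0x1004 = 0xFFC`` to the stored word,
while an absolute relocation adds the target address unchanged.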
				1798
				1799	Target JIT Info
				1800	---------------
				1801
				1802	``XXXJITInfo.cpp`` implements the JIT interfaces for target-specific
				1803	code-generation activities, such as emitting machine code and stubs. At
				1804	minimum, a target-specific version of ``XXXJITInfo`` implements the following:
				1805
				1806	* ``getLazyResolverFunction`` --- Initializes the JIT, gives the target a
				1807	function that is used for compilation.
				1808
				1809	* ``emitFunctionStub`` --- Returns a native function with a specified address
				1810	for a callback function.
* Callback functions that are wrappers to a function stub that is used when the
  real target is not initially known.
				1813	relocation types.
``getLazyResolverFunction`` is generally trivial to implement. It stores the
incoming parameter as the global ``JITCompilerFunction`` and returns the
callback function that will be used as a function wrapper. For the Alpha
target (in ``AlphaJITInfo.cpp``), the ``getLazyResolverFunction``
implementation is simply:
				1819	incoming parameter as the global ``JITCompilerFunction`` and returns the
				1820	callback function that will be used a function wrapper. For the Alpha target
				1821	(in ``AlphaJITInfo.cpp``), the ``getLazyResolverFunction`` implementation is
				1822	simply:
				1823
.. code-block:: c++

  TargetJITInfo::LazyResolverFn AlphaJITInfo::getLazyResolverFunction(
                                              JITCompilerFn F) {
    JITCompilerFunction = F;
    return AlphaCompilationCallback;
  }
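
The store-and-return pattern can be exercised outside of LLVM with plain
function pointers. All names below are hypothetical. As a simplification, the
callback in this sketch receives the stub address as an explicit parameter,
whereas a real callback has to recover it from the machine state, which is why
it is normally written in assembly.

.. code-block:: c++

  // Signature assumptions for this sketch only.
  using ToyJITCompilerFn = void *(*)(void *);
  using ToyLazyResolverFn = void (*)(void *);

  // Global slot for the compiler function, as in the Alpha example.
  static ToyJITCompilerFn ToyJITCompilerFunction = nullptr;

  // Records what the last callback invocation resolved to, so that the
  // effect of the callback is observable.
  static void *ToyLastResolved = nullptr;

  // Callback wrapper: forwards the stub to the stored compiler function.
  static void toyCompilationCallback(void *Stub) {
    ToyLastResolved = ToyJITCompilerFunction(Stub);
  }

  // Remember the compiler function and hand back the callback.
  inline ToyLazyResolverFn getToyLazyResolverFunction(ToyJITCompilerFn F) {
    ToyJITCompilerFunction = F;
    return toyCompilationCallback;
  }

Calling the returned resolver with a stub address invokes whichever compiler
function was registered most recently and stores its result in
``ToyLastResolved``.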
				1831
				1832	For the X86 target, the ``getLazyResolverFunction`` implementation is a little
				1833	more complicated, because it returns a different callback function for
				1834	processors with SSE instructions and XMM registers.
				1835
				1836	The callback function initially saves and later restores the callee register
				1837	values, incoming arguments, and frame and return address. The callback
				1838	function needs low-level access to the registers or stack, so it is typically
				1839	implemented with assembler.
				1840