================================
Writing an LLVM Compiler Backend
================================

.. sectionauthor:: Mason Woo <http://www.woo.com> and Misha Brukman <http://misha.brukman.net>

.. contents::
   :local:

Introduction
============

This document describes techniques for writing compiler backends that convert
the LLVM Intermediate Representation (IR) to code for a specified machine or
other languages.  Code intended for a specific machine can take the form of
either assembly code or binary code (usable for a JIT compiler).

The backend of LLVM features a target-independent code generator that may
create output for several types of target CPUs --- including X86, PowerPC,
ARM, and SPARC.  The backend may also be used to generate code targeted at SPUs
of the Cell processor or GPUs to support the execution of compute kernels.

The document focuses on existing examples found in subdirectories of
``llvm/lib/Target`` in a downloaded LLVM release.  In particular, this document
focuses on the example of creating a static compiler (one that emits text
assembly) for a SPARC target, because SPARC has fairly standard
characteristics, such as a RISC instruction set and straightforward calling
conventions.

Audience
--------

The audience for this document is anyone who needs to write an LLVM backend to
generate code for a specific hardware or software target.

Prerequisite Reading
--------------------

These essential documents must be read before reading this document:

* `LLVM Language Reference Manual <LangRef.html>`_ --- a reference manual for
  the LLVM assembly language.

* :doc:`CodeGenerator` --- a guide to the components (classes and code
  generation algorithms) for translating the LLVM internal representation into
  machine code for a specified target.  Pay particular attention to the
  descriptions of code generation stages: Instruction Selection, Scheduling and
  Formation, SSA-based Optimization, Register Allocation, Prolog/Epilog Code
  Insertion, Late Machine Code Optimizations, and Code Emission.

* :doc:`TableGenFundamentals` --- a document that describes the TableGen
  (``tblgen``) application that manages domain-specific information to support
  LLVM code generation.  TableGen processes input from a target description
  file (``.td`` suffix) and generates C++ code that can be used for code
  generation.

* :doc:`WritingAnLLVMPass` --- The assembly printer is a ``FunctionPass``, as
  are several ``SelectionDAG`` processing steps.

To follow the SPARC examples in this document, have a copy of `The SPARC
Architecture Manual, Version 8 <http://www.sparc.org/standards/V8.pdf>`_ for
reference.  For details about the ARM instruction set, refer to the `ARM
Architecture Reference Manual <http://infocenter.arm.com/>`_.  For more about
the GNU Assembler format (``GAS``), see `Using As
<http://sourceware.org/binutils/docs/as/index.html>`_, especially for the
assembly printer.  "Using As" contains a list of target machine dependent
features.

Basic Steps
-----------

To write a compiler backend for LLVM that converts the LLVM IR to code for a
specified target (machine or other language), follow these steps:

* Create a subclass of the ``TargetMachine`` class that describes
  characteristics of your target machine.  Copy existing examples of specific
  ``TargetMachine`` class and header files; for example, start with
  ``SparcTargetMachine.cpp`` and ``SparcTargetMachine.h``, but change the file
  names for your target.  Similarly, change code that references "``Sparc``" to
  reference your target.

* Describe the register set of the target.  Use TableGen to generate code for
  register definition, register aliases, and register classes from a
  target-specific ``RegisterInfo.td`` input file.  You should also write
  additional code for a subclass of the ``TargetRegisterInfo`` class that
  represents the register file data used for register allocation and also
  describes the interactions between registers.

* Describe the instruction set of the target.  Use TableGen to generate code
  for target-specific instructions from target-specific versions of
  ``TargetInstrFormats.td`` and ``TargetInstrInfo.td``.  You should write
  additional code for a subclass of the ``TargetInstrInfo`` class to represent
  machine instructions supported by the target machine.

* Describe the selection and conversion of the LLVM IR from a Directed Acyclic
  Graph (DAG) representation of instructions to native target-specific
  instructions.  Use TableGen to generate code that matches patterns and
  selects instructions based on additional information in a target-specific
  version of ``TargetInstrInfo.td``.  Write code for ``XXXISelDAGToDAG.cpp``,
  where ``XXX`` identifies the specific target, to perform pattern matching and
  DAG-to-DAG instruction selection.  Also write code in ``XXXISelLowering.cpp``
  to replace or remove operations and data types that are not supported
  natively in a SelectionDAG.

* Write code for an assembly printer that converts LLVM IR to a GAS format for
  your target machine.  You should add assembly strings to the instructions
  defined in your target-specific version of ``TargetInstrInfo.td``.  You
  should also write code for a subclass of ``AsmPrinter`` that performs the
  LLVM-to-assembly conversion and a trivial subclass of ``TargetAsmInfo``.

* Optionally, add support for subtargets (i.e., variants with different
  capabilities).  You should also write code for a subclass of the
  ``TargetSubtarget`` class, which allows you to use the ``-mcpu=`` and
  ``-mattr=`` command-line options.

* Optionally, add JIT support and create a machine code emitter (subclass of
  ``TargetJITInfo``) that is used to emit binary code directly into memory.

In the ``.cpp`` and ``.h`` files, initially stub up these methods and then
implement them later.  At the start, you may not know which private members
the class will need and which components will need to be subclassed.

Preliminaries
-------------

To actually create your compiler backend, you need to create and modify a few
files.  The absolute minimum is discussed here.  But to actually use the LLVM
target-independent code generator, you must perform the steps described in the
:doc:`LLVM Target-Independent Code Generator <CodeGenerator>` document.

First, you should create a subdirectory under ``lib/Target`` to hold all the
files related to your target.  If your target is called "Dummy", create the
directory ``lib/Target/Dummy``.

In this new directory, create a ``Makefile``.  It is easiest to copy a
``Makefile`` of another target and modify it.  It should at least contain the
``LEVEL``, ``LIBRARYNAME`` and ``TARGET`` variables, and then include
``$(LEVEL)/Makefile.common``.  The library can be named ``LLVMDummy`` (for
example, see the MIPS target).  Alternatively, you can split the library into
``LLVMDummyCodeGen`` and ``LLVMDummyAsmPrinter``, the latter of which should be
implemented in a subdirectory below ``lib/Target/Dummy`` (for example, see the
PowerPC target).

Note that these two naming schemes are hardcoded into ``llvm-config``.  Using
any other naming scheme will confuse ``llvm-config`` and produce a lot of
(seemingly unrelated) linker errors when linking ``llc``.

To make your target actually do something, you need to implement a subclass of
``TargetMachine``.  This implementation should typically be in the file
``lib/Target/Dummy/DummyTargetMachine.cpp``, but any file in the
``lib/Target/Dummy`` directory will be built and should work.  To use LLVM's
target-independent code generator, you should do what all current machine
backends do: create a subclass of ``LLVMTargetMachine``.  (To create a target
from scratch, create a subclass of ``TargetMachine``.)

To get LLVM to actually build and link your target, you need to add it to the
``TARGETS_TO_BUILD`` variable.  To do this, you modify the configure script to
know about your target when parsing the ``--enable-targets`` option.  Search
the configure script for ``TARGETS_TO_BUILD``, add your target to the lists
there (some creativity required), and then reconfigure.  Alternatively, you can
change ``autoconf/configure.ac`` and regenerate configure by running
``./autoconf/AutoRegen.sh``.

Target Machine
==============

``LLVMTargetMachine`` is designed as a base class for targets implemented with
the LLVM target-independent code generator.  The ``LLVMTargetMachine`` class
should be specialized by a concrete target class that implements the various
virtual methods.  ``LLVMTargetMachine`` is defined as a subclass of
``TargetMachine`` in ``include/llvm/Target/TargetMachine.h``.  The
``TargetMachine`` class implementation (``TargetMachine.cpp``) also processes
numerous command-line options.

To create a concrete target-specific subclass of ``LLVMTargetMachine``, start
by copying an existing ``TargetMachine`` class and header.  You should name the
files that you create to reflect your specific target.  For instance, for the
SPARC target, name the files ``SparcTargetMachine.h`` and
``SparcTargetMachine.cpp``.

For a target machine ``XXX``, the implementation of ``XXXTargetMachine`` must
have access methods to obtain objects that represent target components.  These
methods are named ``get*Info``, and are intended to obtain the instruction set
(``getInstrInfo``), register set (``getRegisterInfo``), stack frame layout
(``getFrameInfo``), and similar information.  ``XXXTargetMachine`` must also
implement the ``getDataLayout`` method to access an object with target-specific
data characteristics, such as data type size and alignment requirements.

For instance, for the SPARC target, the header file ``SparcTargetMachine.h``
declares prototypes for several ``get*Info`` and ``getDataLayout`` methods that
simply return a class member.

.. code-block:: c++

  namespace llvm {

  class Module;

  class SparcTargetMachine : public LLVMTargetMachine {
    const DataLayout DataLayout;       // Calculates type size & alignment
    SparcSubtarget Subtarget;
    SparcInstrInfo InstrInfo;
    TargetFrameInfo FrameInfo;

  protected:
    virtual const TargetAsmInfo *createTargetAsmInfo() const;

  public:
    SparcTargetMachine(const Module &M, const std::string &FS);

    virtual const SparcInstrInfo *getInstrInfo() const { return &InstrInfo; }
    virtual const TargetFrameInfo *getFrameInfo() const { return &FrameInfo; }
    virtual const TargetSubtarget *getSubtargetImpl() const { return &Subtarget; }
    virtual const TargetRegisterInfo *getRegisterInfo() const {
      return &InstrInfo.getRegisterInfo();
    }
    virtual const DataLayout *getDataLayout() const { return &DataLayout; }
    static unsigned getModuleMatchQuality(const Module &M);

    // Pass Pipeline Configuration
    virtual bool addInstSelector(PassManagerBase &PM, bool Fast);
    virtual bool addPreEmitPass(PassManagerBase &PM, bool Fast);
  };

  } // end namespace llvm

Every target is expected to support the following methods:

* ``getInstrInfo()``
* ``getRegisterInfo()``
* ``getFrameInfo()``
* ``getDataLayout()``
* ``getSubtargetImpl()``

For some targets, you also need to support the following methods:

* ``getTargetLowering()``
* ``getJITInfo()``

In addition, the ``XXXTargetMachine`` constructor should specify a
``TargetDescription`` string that determines the data layout for the target
machine, including characteristics such as pointer size, alignment, and
endianness.  For example, the constructor for ``SparcTargetMachine`` contains
the following:

.. code-block:: c++

  SparcTargetMachine::SparcTargetMachine(const Module &M, const std::string &FS)
    : DataLayout("E-p:32:32-f128:128:128"),
      Subtarget(M, FS), InstrInfo(Subtarget),
      FrameInfo(TargetFrameInfo::StackGrowsDown, 8, 0) {
  }

Hyphens separate portions of the ``TargetDescription`` string.

* An upper-case "``E``" in the string indicates a big-endian target data model.
  A lower-case "``e``" indicates little-endian.

* "``p:``" is followed by pointer information: size, ABI alignment, and
  preferred alignment.  If only two figures follow "``p:``", then the first
  value is pointer size, and the second value is both ABI and preferred
  alignment.

* Each remaining portion starts with a letter that selects the kind of type
  whose alignment is described: "``i``", "``f``", "``v``", or "``a``"
  (corresponding to integer, floating point, vector, or aggregate).  "``i``",
  "``v``", or "``a``" are followed by ABI alignment and preferred alignment.
  "``f``" is followed by three values: the first indicates the size of a long
  double, then ABI alignment, and then preferred alignment.

In the SPARC string above, "``E``" selects big-endian, "``p:32:32``" gives
32-bit pointers with 32-bit ABI and preferred alignment, and
"``f128:128:128``" gives a 128-bit long double with 128-bit ABI and preferred
alignment.

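The rules above can be illustrated with a small standalone parser.  This is
only a sketch: ``LayoutSketch``, ``AlignSpec``, and ``parseLayoutSketch`` are
hypothetical names invented for this example and are not part of LLVM, whose
``DataLayout`` class does the real parsing.  The sketch assumes that a size
written directly after the type letter (as in ``f128``) is the type size in
bits, and that a missing preferred alignment defaults to the ABI alignment.

.. code-block:: c++

  #include <cassert>
  #include <cstddef>
  #include <sstream>
  #include <string>
  #include <vector>

  // Hypothetical, simplified model of one portion of the layout string.
  struct AlignSpec {
    char Kind;          // 'p', 'i', 'f', 'v', or 'a'
    unsigned Size;      // size in bits; 0 when the portion gives no size
    unsigned ABIAlign;  // ABI alignment in bits
    unsigned PrefAlign; // preferred alignment in bits
  };

  struct LayoutSketch {
    bool BigEndian = false;
    std::vector<AlignSpec> Specs;
  };

  // Splits S at every occurrence of Sep.
  static std::vector<std::string> splitSketch(const std::string &S, char Sep) {
    std::vector<std::string> Parts;
    std::string Part;
    std::istringstream In(S);
    while (std::getline(In, Part, Sep))
      Parts.push_back(Part);
    return Parts;
  }

  LayoutSketch parseLayoutSketch(const std::string &Desc) {
    LayoutSketch L;
    for (const std::string &Portion : splitSketch(Desc, '-')) {
      if (Portion == "E") { L.BigEndian = true; continue; }
      if (Portion == "e") { L.BigEndian = false; continue; }

      std::vector<std::string> F = splitSketch(Portion, ':');
      if (F.size() < 2 || F[0].empty())
        continue; // not an alignment portion this sketch understands

      AlignSpec A{};
      A.Kind = F[0][0];
      std::size_t Next = 1;
      if (A.Kind == 'p')
        A.Size = std::stoul(F[Next++]); // "p:" is followed by the size
      else
        A.Size = F[0].size() > 1 ? std::stoul(F[0].substr(1)) : 0;

      if (Next >= F.size())
        continue; // no alignment given
      A.ABIAlign = std::stoul(F[Next++]);
      // With only one alignment figure, it is both ABI and preferred.
      A.PrefAlign = Next < F.size() ? std::stoul(F[Next]) : A.ABIAlign;
      L.Specs.push_back(A);
    }
    return L;
  }

Calling ``parseLayoutSketch("E-p:32:32-f128:128:128")`` on the SPARC string
should report a big-endian layout with one pointer entry and one
floating-point entry.
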
Target Registration
===================

You must also register your target with the ``TargetRegistry``, which is what
other LLVM tools use to look up and use your target at runtime.  The
``TargetRegistry`` can be used directly, but for most targets there are helper
templates which should take care of the work for you.

All targets should declare a global ``Target`` object which is used to
represent the target during registration.  Then, in the target's ``TargetInfo``
library, the target should define that object and use the ``RegisterTarget``
template to register the target.  For example, the Sparc registration code
looks like this:

.. code-block:: c++

  Target llvm::TheSparcTarget;

  extern "C" void LLVMInitializeSparcTargetInfo() {
    RegisterTarget<Triple::sparc, /*HasJIT=*/false>
      X(TheSparcTarget, "sparc", "Sparc");
  }

This allows the ``TargetRegistry`` to look up the target by name or by target
triple.  In addition, most targets will also register additional features which
are available in separate libraries.  These registration steps are separate,
because some clients may wish to only link in some parts of the target --- the
JIT code generator does not require the use of the assembler printer, for
example.  Here is an example of registering the Sparc assembly printer:

.. code-block:: c++

  extern "C" void LLVMInitializeSparcAsmPrinter() {
    RegisterAsmPrinter<SparcAsmPrinter> X(TheSparcTarget);
  }

For more information, see "`llvm/Target/TargetRegistry.h
</doxygen/TargetRegistry_8h-source.html>`_".

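The registration pattern itself, a helper object whose constructor records the
target in a global table, can be sketched without any LLVM headers.
Everything below (``TargetSketch``, ``RegistrySketch``,
``RegisterTargetSketch``, ``RegisterAsmPrinterSketch``, and the "dummy"
target) is a hypothetical stand-in written for illustration; the real
``Target`` and ``TargetRegistry`` classes carry far more state and also
support lookup by triple.

.. code-block:: c++

  #include <cassert>
  #include <map>
  #include <string>

  // Hypothetical stand-in for llvm::Target.
  struct TargetSketch {
    std::string Name;
    std::string Description;
    bool HasJIT = false;
    bool HasAsmPrinter = false;
  };

  // Hypothetical stand-in for the registry: a name -> target table.
  class RegistrySketch {
  public:
    static std::map<std::string, TargetSketch *> &table() {
      static std::map<std::string, TargetSketch *> Table;
      return Table;
    }
    static const TargetSketch *lookup(const std::string &Name) {
      auto It = table().find(Name);
      return It == table().end() ? nullptr : It->second;
    }
  };

  // Constructing this helper registers the target by name.
  struct RegisterTargetSketch {
    RegisterTargetSketch(TargetSketch &T, const char *Name, const char *Desc,
                         bool HasJIT) {
      T.Name = Name;
      T.Description = Desc;
      T.HasJIT = HasJIT;
      RegistrySketch::table()[Name] = &T;
    }
  };

  // Optional components are attached in a separate step, so a client that
  // never calls this initializer does not depend on the assembly printer.
  struct RegisterAsmPrinterSketch {
    explicit RegisterAsmPrinterSketch(TargetSketch &T) { T.HasAsmPrinter = true; }
  };

  TargetSketch TheDummyTarget;

  void initializeDummyTargetInfo() {
    RegisterTargetSketch X(TheDummyTarget, "dummy", "Dummy", /*HasJIT=*/false);
  }

  void initializeDummyAsmPrinter() {
    RegisterAsmPrinterSketch X(TheDummyTarget);
  }

A tool would call ``initializeDummyTargetInfo()`` once at startup and then use
``RegistrySketch::lookup("dummy")``; the assembly printer becomes available
only after ``initializeDummyAsmPrinter()`` has also run.
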
Register Set and Register Classes
=================================

You should describe a concrete target-specific class that represents the
register file of a target machine.  This class is called ``XXXRegisterInfo``
(where ``XXX`` identifies the target) and represents the register file data
that is used for register allocation.  It also describes the interactions
between registers.

You also need to define register classes to categorize related registers.  A
register class should be added for groups of registers that are all treated the
same way for some instruction.  Typical examples are register classes for
integer, floating-point, or vector registers.  A register allocator allows an
instruction to use any register in a specified register class to perform the
instruction in a similar manner.  Register classes allocate virtual registers
to instructions from these sets, and register classes let the
target-independent register allocator automatically choose the actual
registers.

Much of the code for registers, including register definition, register
aliases, and register classes, is generated by TableGen from
``XXXRegisterInfo.td`` input files and placed in ``XXXGenRegisterInfo.h.inc``
and ``XXXGenRegisterInfo.inc`` output files.  Some of the code in the
implementation of ``XXXRegisterInfo`` requires hand-coding.

Defining a Register
-------------------

The ``XXXRegisterInfo.td`` file typically starts with register definitions for
a target machine.  The ``Register`` class (specified in ``Target.td``) is used
to define an object for each register.  The specified string ``n`` becomes the
``Name`` of the register.  The basic ``Register`` object does not have any
subregisters and does not specify any aliases.

.. code-block:: llvm

  class Register<string n> {
    string Namespace = "";
    string AsmName = n;
    string Name = n;
    int SpillSize = 0;
    int SpillAlignment = 0;
    list<Register> Aliases = [];
    list<Register> SubRegs = [];
    list<int> DwarfNumbers = [];
  }

For example, in the ``X86RegisterInfo.td`` file, there are register definitions
that utilize the ``Register`` class, such as:

.. code-block:: llvm

  def AL : Register<"AL">, DwarfRegNum<[0, 0, 0]>;

This defines the register ``AL`` and assigns it values (with ``DwarfRegNum``)
that are used by ``gcc``, ``gdb``, or a debug information writer to identify a
register.  For register ``AL``, ``DwarfRegNum`` takes an array of 3 values
representing 3 different modes: the first element is for X86-64, the second for
exception handling (EH) on X86-32, and the third is generic.  -1 is a special
Dwarf number that indicates the gcc number is undefined, and -2 indicates the
register number is invalid for this mode.

From the previously described line in the ``X86RegisterInfo.td`` file, TableGen
generates this code in the ``X86GenRegisterInfo.inc`` file:

.. code-block:: c++

  static const unsigned GR8[] = { X86::AL, ... };

  const unsigned AL_AliasSet[] = { X86::AX, X86::EAX, X86::RAX, 0 };

  const TargetRegisterDesc RegisterDescriptors[] = {
    ...
  { "AL", "AL", AL_AliasSet, Empty_SubRegsSet, Empty_SubRegsSet, AL_SuperRegsSet }, ...

From the register info file, TableGen generates a ``TargetRegisterDesc`` object
for each register.  ``TargetRegisterDesc`` is defined in
``include/llvm/Target/TargetRegisterInfo.h`` with the following fields:

.. code-block:: c++

  struct TargetRegisterDesc {
    const char     *AsmName;      // Assembly language name for the register
    const char     *Name;         // Printable name for the reg (for debugging)
    const unsigned *AliasSet;     // Register Alias Set
    const unsigned *SubRegs;      // Sub-register set
    const unsigned *ImmSubRegs;   // Immediate sub-register set
    const unsigned *SuperRegs;    // Super-register set
  };

TableGen uses the entire target description file (``.td``) to determine text
names for the register (in the ``AsmName`` and ``Name`` fields of
``TargetRegisterDesc``) and the relationships of other registers to the defined
register (in the other ``TargetRegisterDesc`` fields).  In this example, other
definitions establish the registers "``AX``", "``EAX``", and "``RAX``" as
aliases for one another, so TableGen generates a null-terminated array
(``AL_AliasSet``) for this register alias set.

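Client code walks such a zero-terminated set until it reaches the terminator.
The following standalone sketch shows that idiom; the ``sketch`` namespace,
its register numbering, and the ``inSet`` and ``setSize`` helpers are
hypothetical and exist only for this example.  The one property it relies on
is that register number ``0`` is never a real register, so it can mark the end
of the array.

.. code-block:: c++

  #include <cassert>
  #include <cstddef>

  namespace sketch {

  // Hypothetical register numbers; 0 is reserved as "no register".
  enum Reg : unsigned { NoRegister = 0, AL, AX, EAX, RAX };

  // Same shape as the generated AL_AliasSet: members, then a 0 terminator.
  const unsigned AL_AliasSet[] = { AX, EAX, RAX, 0 };

  // Returns true if R appears in the zero-terminated set.
  bool inSet(const unsigned *Set, unsigned R) {
    for (; *Set != 0; ++Set)
      if (*Set == R)
        return true;
    return false;
  }

  // Counts the members of a zero-terminated set (terminator excluded).
  std::size_t setSize(const unsigned *Set) {
    std::size_t N = 0;
    while (Set[N] != 0)
      ++N;
    return N;
  }

  } // end namespace sketch

Because the array carries its own terminator, a descriptor only needs to store
a single pointer per set, which is how the ``TargetRegisterDesc`` fields above
are declared.
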
The ``Register`` class is commonly used as a base class for more complex
classes.  In ``Target.td``, the ``Register`` class is the base for the
``RegisterWithSubRegs`` class that is used to define registers that need to
specify subregisters in the ``SubRegs`` list, as shown here:

.. code-block:: llvm

  class RegisterWithSubRegs<string n, list<Register> subregs> : Register<n> {
    let SubRegs = subregs;
  }

In ``SparcRegisterInfo.td``, additional register classes are defined for SPARC:
a ``Register`` subclass, ``SparcReg``, and further subclasses: ``Ri``, ``Rf``,
and ``Rd``.  SPARC registers are identified by 5-bit ID numbers, which is a
feature common to these subclasses.  Note the use of "``let``" expressions to
override values that are initially defined in a superclass (such as ``SubRegs``
field in the ``Rd`` class).

.. code-block:: llvm

  class SparcReg<string n> : Register<n> {
    field bits<5> Num;
    let Namespace = "SP";
  }
  // Ri - 32-bit integer registers
  class Ri<bits<5> num, string n> :
  SparcReg<n> {
    let Num = num;
  }
  // Rf - 32-bit floating-point registers
  class Rf<bits<5> num, string n> :
  SparcReg<n> {
    let Num = num;
  }
  // Rd - Slots in the FP register file for 64-bit floating-point values.
  class Rd<bits<5> num, string n, list<Register> subregs> : SparcReg<n> {
    let Num = num;
    let SubRegs = subregs;
  }

In the ``SparcRegisterInfo.td`` file, there are register definitions that
utilize these subclasses of ``Register``, such as:

.. code-block:: llvm

  def G0 : Ri< 0, "G0">, DwarfRegNum<[0]>;
  def G1 : Ri< 1, "G1">, DwarfRegNum<[1]>;
  ...
  def F0 : Rf< 0, "F0">, DwarfRegNum<[32]>;
  def F1 : Rf< 1, "F1">, DwarfRegNum<[33]>;
  ...
  def D0 : Rd< 0, "F0", [F0, F1]>, DwarfRegNum<[32]>;
  def D1 : Rd< 2, "F2", [F2, F3]>, DwarfRegNum<[34]>;

The last two registers shown above (``D0`` and ``D1``) are double-precision
floating-point registers that are aliases for pairs of single-precision
floating-point sub-registers.  In addition to aliases, the sub-register and
super-register relationships of the defined register are in fields of a
register's ``TargetRegisterDesc``.

Defining a Register Class
-------------------------

The ``RegisterClass`` class (specified in ``Target.td``) is used to define an
object that represents a group of related registers and also defines the
default allocation order of the registers.  A target description file
``XXXRegisterInfo.td`` that uses ``Target.td`` can construct register classes
using the following class:

.. code-block:: llvm

  class RegisterClass<string namespace,
  list<ValueType> regTypes, int alignment, dag regList> {
    string Namespace = namespace;
    list<ValueType> RegTypes = regTypes;
    int Size = 0;  // spill size, in bits; zero lets tblgen pick the size
    int Alignment = alignment;

    // CopyCost is the cost of copying a value between two registers
    // default value 1 means a single instruction
    // A negative value means copying is extremely expensive or impossible
    int CopyCost = 1;
    dag MemberList = regList;

    // for register classes that are subregisters of this class
    list<RegisterClass> SubRegClassList = [];

    code MethodProtos = [{}];  // to insert arbitrary code
    code MethodBodies = [{}];
  }

To define a ``RegisterClass``, use the following 4 arguments:

* The first argument of the definition is the name of the namespace.

* The second argument is a list of ``ValueType`` register type values that are
  defined in ``include/llvm/CodeGen/ValueTypes.td``.  Defined values include
  integer types (such as ``i16``, ``i32``, and ``i1`` for Boolean),
  floating-point types (``f32``, ``f64``), and vector types (for example,
  ``v8i16`` for an ``8 x i16`` vector).  All registers in a ``RegisterClass``
  must have the same ``ValueType``, but some registers may store vector data in
  different configurations.  For example, a register that can process a 128-bit
  vector may be able to handle 16 8-bit integer elements, 8 16-bit integers, 4
  32-bit integers, and so on.

* The third argument of the ``RegisterClass`` definition specifies the
  alignment required of the registers when they are stored or loaded to
  memory.

* The final argument, ``regList``, specifies which registers are in this class.
  If an alternative allocation order method is not specified, then ``regList``
  also defines the order of allocation used by the register allocator.  Besides
  simply listing registers with ``(add R0, R1, ...)``, more advanced set
  operators are available.  See ``include/llvm/Target/Target.td`` for more
  information.

In ``SparcRegisterInfo.td``, three ``RegisterClass`` objects are defined:
``FPRegs``, ``DFPRegs``, and ``IntRegs``.  For all three register classes, the
first argument defines the namespace with the string "``SP``".  ``FPRegs``
defines a group of 32 single-precision floating-point registers (``F0`` to
``F31``); ``DFPRegs`` defines a group of 16 double-precision registers
(``D0-D15``).

.. code-block:: llvm

  // F0, F1, F2, ..., F31
  def FPRegs : RegisterClass<"SP", [f32], 32, (sequence "F%u", 0, 31)>;

  def DFPRegs : RegisterClass<"SP", [f64], 64,
                              (add D0, D1, D2, D3, D4, D5, D6, D7, D8,
                                   D9, D10, D11, D12, D13, D14, D15)>;

  def IntRegs : RegisterClass<"SP", [i32], 32,
      (add L0, L1, L2, L3, L4, L5, L6, L7,
           I0, I1, I2, I3, I4, I5,
           O0, O1, O2, O3, O4, O5, O7,
           G1,
           // Non-allocatable regs:
           G2, G3, G4,
           O6,        // stack ptr
           I6,        // frame ptr
           I7,        // return address
           G0,        // constant zero
           G5, G6, G7 // reserved for kernel
      )>;

Using ``SparcRegisterInfo.td`` with TableGen generates several output files
that are intended for inclusion in other source code that you write.
``SparcRegisterInfo.td`` generates ``SparcGenRegisterInfo.h.inc``, which should
be included in the header file for the SPARC register implementation that you
write (``SparcRegisterInfo.h``).  In ``SparcGenRegisterInfo.h.inc`` a new
structure is defined called ``SparcGenRegisterInfo`` that uses
``TargetRegisterInfo`` as its base.  It also specifies types, based upon the
defined register classes: ``DFPRegsClass``, ``FPRegsClass``, and
``IntRegsClass``.

``SparcRegisterInfo.td`` also generates ``SparcGenRegisterInfo.inc``, which is
included at the bottom of ``SparcRegisterInfo.cpp``, the SPARC register
implementation.  The code below shows only the generated integer registers and
associated register classes.  The order of registers in ``IntRegs`` reflects
the order in the definition of ``IntRegs`` in the target description file.

.. code-block:: c++

  // IntRegs Register Class...
  static const unsigned IntRegs[] = {
    SP::L0, SP::L1, SP::L2, SP::L3, SP::L4, SP::L5,
    SP::L6, SP::L7, SP::I0, SP::I1, SP::I2, SP::I3,
    SP::I4, SP::I5, SP::O0, SP::O1, SP::O2, SP::O3,
    SP::O4, SP::O5, SP::O7, SP::G1, SP::G2, SP::G3,
    SP::G4, SP::O6, SP::I6, SP::I7, SP::G0, SP::G5,
    SP::G6, SP::G7,
  };

  // IntRegsVTs Register Class Value Types...
  static const MVT::ValueType IntRegsVTs[] = {
    MVT::i32, MVT::Other
  };

  namespace SP {   // Register class instances
    DFPRegsClass    DFPRegsRegClass;
    FPRegsClass     FPRegsRegClass;
    IntRegsClass    IntRegsRegClass;
  ...
    // IntRegs Sub-register Classes...
    static const TargetRegisterClass* const IntRegsSubRegClasses [] = {
      NULL
    };
  ...
    // IntRegs Super-register Classes...
    static const TargetRegisterClass* const IntRegsSuperRegClasses [] = {
      NULL
    };
  ...
    // IntRegs Register Class sub-classes...
    static const TargetRegisterClass* const IntRegsSubclasses [] = {
      NULL
    };
  ...
    // IntRegs Register Class super-classes...
    static const TargetRegisterClass* const IntRegsSuperclasses [] = {
      NULL
    };

    IntRegsClass::IntRegsClass() : TargetRegisterClass(IntRegsRegClassID,
      IntRegsVTs, IntRegsSubclasses, IntRegsSuperclasses, IntRegsSubRegClasses,
      IntRegsSuperRegClasses, 4, 4, 1, IntRegs, IntRegs + 32) {}
  }

The register allocators will avoid using reserved registers, and callee-saved
registers are not used until all the volatile registers have been used.  That
is usually good enough, but in some cases it may be necessary to provide custom
allocation orders.

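The default policy described above can be pictured as a filter over the order
listed in the register class.  The function below is a hypothetical sketch,
not the allocator's actual code: ``allocationOrderSketch`` and its parameters
are names made up for this example, and registers are plain integers.

.. code-block:: c++

  #include <cassert>
  #include <set>
  #include <vector>

  // Hypothetical sketch: derive an allocation order from the class order.
  // Reserved registers are dropped, volatile registers keep their listed
  // order, and callee-saved registers are tried only after the volatile ones.
  std::vector<unsigned>
  allocationOrderSketch(const std::vector<unsigned> &ClassOrder,
                        const std::set<unsigned> &Reserved,
                        const std::set<unsigned> &CalleeSaved) {
    std::vector<unsigned> Volatile, Saved;
    for (unsigned R : ClassOrder) {
      if (Reserved.count(R))
        continue; // never handed out by the allocator
      if (CalleeSaved.count(R))
        Saved.push_back(R);
      else
        Volatile.push_back(R);
    }
    Volatile.insert(Volatile.end(), Saved.begin(), Saved.end());
    return Volatile;
  }

With a class order of ``1, 2, 3, 4, 5`` where register ``5`` is reserved and
register ``2`` is callee-saved, the sketch yields ``1, 3, 4, 2``.  A target
whose constraints do not fit this scheme is the case in which a custom
allocation order is needed.
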
				619	Implement a subclass of ``TargetRegisterInfo``
				620	----------------------------------------------
				621
				622	The final step is to hand code portions of ``XXXRegisterInfo``, which
				623	implements the interface described in ``TargetRegisterInfo.h`` (see
				624	:ref:`TargetRegisterInfo`). These functions return ``0``, ``NULL``, or
				625	``false``, unless overridden. Here is a list of functions that are overridden
				626	for the SPARC implementation in ``SparcRegisterInfo.cpp``:
				627
				628	* ``getCalleeSavedRegs`` --- Returns a list of callee-saved registers in the
				629	order of the desired callee-save stack frame offset.
				630
				631	* ``getReservedRegs`` --- Returns a bitset indexed by physical register
				632	numbers, indicating if a particular register is unavailable.
				633
				634	* ``hasFP`` --- Return a Boolean indicating if a function should have a
				635	dedicated frame pointer register.
				636
				637	* ``eliminateCallFramePseudoInstr`` --- If call frame setup or destroy pseudo
				638	instructions are used, this can be called to eliminate them.
				639
				640	* ``eliminateFrameIndex`` --- Eliminate abstract frame indices from
				641	instructions that may use them.
				642
				643	* ``emitPrologue`` --- Insert prologue code into the function.
				644
				645	* ``emitEpilogue`` --- Insert epilogue code into the function.
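As a rough illustration of the ``getReservedRegs`` contract described in the
list above, the following standalone sketch models the returned bitset with
``std::bitset``. The register numbering, the ``ToyReg`` enum, the
``isAllocatable`` helper, and the choice of reserved registers are all
assumptions made for this example and are not taken from
``SparcRegisterInfo.cpp``; a real implementation returns an LLVM ``BitVector``.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical register numbering for a toy target with 32 integer registers.
enum ToyReg : std::size_t { G0 = 0, G1 = 1, O6 = 14, I6 = 30, NumToyRegs = 32 };

// Model of getReservedRegs: bit N is set when register N must not be
// handed out by the register allocator.
std::bitset<NumToyRegs> getReservedRegs() {
  std::bitset<NumToyRegs> Reserved;
  Reserved.set(G0); // assumed: hard-wired zero register
  Reserved.set(O6); // assumed: stack pointer
  Reserved.set(I6); // assumed: frame pointer
  return Reserved;
}

// Convenience query that a (hypothetical) allocator could use.
bool isAllocatable(std::size_t Reg) { return !getReservedRegs().test(Reg); }
```

With these assumptions, ``isAllocatable(G0)`` is false and ``isAllocatable(G1)``
is true.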
				646
				647	.. _instruction-set:
				648
				649	Instruction Set
				650	===============
				651
During the early stages of code generation, the LLVM IR code is converted to a
``SelectionDAG`` with nodes that are instances of the ``SDNode`` class
containing target instructions. An ``SDNode`` has an opcode, operands, type
requirements, and operation properties (for example, whether an operation is
commutative or whether it loads from memory). The various operation node
types are described in the ``include/llvm/CodeGen/SelectionDAGNodes.h`` file
(values of the ``NodeType`` enum in the ``ISD`` namespace).
				659
				660	TableGen uses the following target description (``.td``) input files to
				661	generate much of the code for instruction definition:
				662
				663	* ``Target.td`` --- Where the ``Instruction``, ``Operand``, ``InstrInfo``, and
				664	other fundamental classes are defined.
				665
* ``TargetSelectionDAG.td`` --- Used by ``SelectionDAG`` instruction selection
  generators, contains ``SDTC*`` classes (selection DAG type constraint),
  definitions of ``SelectionDAG`` nodes (such as ``imm``, ``cond``, ``bb``,
  ``add``, ``fadd``, ``sub``), and pattern support (``Pattern``, ``Pat``,
  ``PatFrag``, ``PatLeaf``, ``ComplexPattern``).
				671
				672	* ``XXXInstrFormats.td`` --- Patterns for definitions of target-specific
				673	instructions.
				674
* ``XXXInstrInfo.td`` --- Target-specific definitions of instruction templates,
  condition codes, and instructions of an instruction set. For architecture
  modifications, a different file name may be used. For example, for Pentium
  with SSE instructions, this file is ``X86InstrSSE.td``, and for Pentium with
  MMX, this file is ``X86InstrMMX.td``.
				680
				681	There is also a target-specific ``XXX.td`` file, where ``XXX`` is the name of
				682	the target. The ``XXX.td`` file includes the other ``.td`` input files, but
				683	its contents are only directly important for subtargets.
				684
				685	You should describe a concrete target-specific class ``XXXInstrInfo`` that
				686	represents machine instructions supported by a target machine.
				687	``XXXInstrInfo`` contains an array of ``XXXInstrDescriptor`` objects, each of
				688	which describes one instruction. An instruction descriptor defines:
				689
				690	* Opcode mnemonic
				691	* Number of operands
				692	* List of implicit register definitions and uses
				693	* Target-independent properties (such as memory access, is commutable)
				694	* Target-specific flags
				695
The ``Instruction`` class (defined in ``Target.td``) is mostly used as a base
for more complex instruction classes.
				698
				699	.. code-block:: llvm
				700
				701	class Instruction {
				702	string Namespace = "";
				703	dag OutOperandList; // A dag containing the MI def operand list.
				704	dag InOperandList; // A dag containing the MI use operand list.
				705	string AsmString = ""; // The .s format to print the instruction with.
				706	list<dag> Pattern; // Set to the DAG pattern for this instruction.
				707	list<Register> Uses = [];
				708	list<Register> Defs = [];
				709	list<Predicate> Predicates = []; // predicates turned into isel match code
				710	... remainder not shown for space ...
				711	}
				712
				713	A ``SelectionDAG`` node (``SDNode``) should contain an object representing a
				714	target-specific instruction that is defined in ``XXXInstrInfo.td``. The
				715	instruction objects should represent instructions from the architecture manual
				716	of the target machine (such as the SPARC Architecture Manual for the SPARC
				717	target).
				718
				719	A single instruction from the architecture manual is often modeled as multiple
				720	target instructions, depending upon its operands. For example, a manual might
				721	describe an add instruction that takes a register or an immediate operand. An
				722	LLVM target could model this with two instructions named ``ADDri`` and
				723	``ADDrr``.
				724
				725	You should define a class for each instruction category and define each opcode
				726	as a subclass of the category with appropriate parameters such as the fixed
				727	binary encoding of opcodes and extended opcodes. You should map the register
				728	bits to the bits of the instruction in which they are encoded (for the JIT).
				729	Also you should specify how the instruction should be printed when the
				730	automatic assembly printer is used.
				731
				732	As is described in the SPARC Architecture Manual, Version 8, there are three
				733	major 32-bit formats for instructions. Format 1 is only for the ``CALL``
				734	instruction. Format 2 is for branch on condition codes and ``SETHI`` (set high
				735	bits of a register) instructions. Format 3 is for other instructions.
				736
Each of these formats has corresponding classes in ``SparcInstrFormats.td``.
``InstSP`` is a base class for other instruction classes. Additional base
classes are specified for more precise formats: for example, in
``SparcInstrFormats.td``, ``F2_1`` is for ``SETHI``, and ``F2_2`` is for
branches. There are three other base classes: ``F3_1`` for register/register
operations, ``F3_2`` for register/immediate operations, and ``F3_3`` for
floating-point operations. ``SparcInstrInfo.td`` also adds the base class
``Pseudo`` for synthetic SPARC instructions.
				745
				746	``SparcInstrInfo.td`` largely consists of operand and instruction definitions
				747	for the SPARC target. In ``SparcInstrInfo.td``, the following target
				748	description file entry, ``LDrr``, defines the Load Integer instruction for a
				749	Word (the ``LD`` SPARC opcode) from a memory address to a register. The first
				750	parameter, the value 3 (``11``\ :sub:`2`), is the operation value for this
				751	category of operation. The second parameter (``000000``\ :sub:`2`) is the
				752	specific operation value for ``LD``/Load Word. The third parameter is the
				753	output destination, which is a register operand and defined in the ``Register``
				754	target description file (``IntRegs``).
				755
				756	.. code-block:: llvm
				757
				758	def LDrr : F3_1 <3, 0b000000, (outs IntRegs:$dst), (ins MEMrr:$addr),
				759	"ld [$addr], $dst",
				760	[(set IntRegs:$dst, (load ADDRrr:$addr))]>;
				761
				762	The fourth parameter is the input source, which uses the address operand
				763	``MEMrr`` that is defined earlier in ``SparcInstrInfo.td``:
				764
				765	.. code-block:: llvm
				766
				767	def MEMrr : Operand<i32> {
				768	let PrintMethod = "printMemOperand";
				769	let MIOperandInfo = (ops IntRegs, IntRegs);
				770	}
				771
				772	The fifth parameter is a string that is used by the assembly printer and can be
				773	left as an empty string until the assembly printer interface is implemented.
				774	The sixth and final parameter is the pattern used to match the instruction
				775	during the SelectionDAG Select Phase described in :doc:`CodeGenerator`.
				776	This parameter is detailed in the next section, :ref:`instruction-selector`.
				777
				778	Instruction class definitions are not overloaded for different operand types,
				779	so separate versions of instructions are needed for register, memory, or
				780	immediate value operands. For example, to perform a Load Integer instruction
				781	for a Word from an immediate operand to a register, the following instruction
				782	class is defined:
				783
				784	.. code-block:: llvm
				785
				786	def LDri : F3_2 <3, 0b000000, (outs IntRegs:$dst), (ins MEMri:$addr),
				787	"ld [$addr], $dst",
				788	[(set IntRegs:$dst, (load ADDRri:$addr))]>;
				789
				790	Writing these definitions for so many similar instructions can involve a lot of
				791	cut and paste. In ``.td`` files, the ``multiclass`` directive enables the
				792	creation of templates to define several instruction classes at once (using the
				793	``defm`` directive). For example in ``SparcInstrInfo.td``, the ``multiclass``
				794	pattern ``F3_12`` is defined to create 2 instruction classes each time
				795	``F3_12`` is invoked:
				796
				797	.. code-block:: llvm
				798
				799	multiclass F3_12 <string OpcStr, bits<6> Op3Val, SDNode OpNode> {
				800	def rr : F3_1 <2, Op3Val,
				801	(outs IntRegs:$dst), (ins IntRegs:$b, IntRegs:$c),
				802	!strconcat(OpcStr, " $b, $c, $dst"),
				803	[(set IntRegs:$dst, (OpNode IntRegs:$b, IntRegs:$c))]>;
				804	def ri : F3_2 <2, Op3Val,
				805	(outs IntRegs:$dst), (ins IntRegs:$b, i32imm:$c),
				806	!strconcat(OpcStr, " $b, $c, $dst"),
				807	[(set IntRegs:$dst, (OpNode IntRegs:$b, simm13:$c))]>;
				808	}
				809
				810	So when the ``defm`` directive is used for the ``XOR`` and ``ADD``
				811	instructions, as seen below, it creates four instruction objects: ``XORrr``,
				812	``XORri``, ``ADDrr``, and ``ADDri``.
				813
				814	.. code-block:: llvm
				815
				816	defm XOR : F3_12<"xor", 0b000011, xor>;
				817	defm ADD : F3_12<"add", 0b000000, add>;
				818
				819	``SparcInstrInfo.td`` also includes definitions for condition codes that are
				820	referenced by branch instructions. The following definitions in
				821	``SparcInstrInfo.td`` indicate the bit location of the SPARC condition code.
				822	For example, the 10\ :sup:`th` bit represents the "greater than" condition for
				823	integers, and the 22\ :sup:`nd` bit represents the "greater than" condition for
				824	floats.
				825
				826	.. code-block:: llvm
				827
				828	def ICC_NE : ICC_VAL< 9>; // Not Equal
				829	def ICC_E : ICC_VAL< 1>; // Equal
				830	def ICC_G : ICC_VAL<10>; // Greater
				831	...
				832	def FCC_U : FCC_VAL<23>; // Unordered
				833	def FCC_G : FCC_VAL<22>; // Greater
				834	def FCC_UG : FCC_VAL<21>; // Unordered or Greater
				835	...
				836
				837	(Note that ``Sparc.h`` also defines enums that correspond to the same SPARC
				838	condition codes. Care must be taken to ensure the values in ``Sparc.h``
				839	correspond to the values in ``SparcInstrInfo.td``. I.e., ``SPCC::ICC_NE = 9``,
				840	``SPCC::FCC_U = 23`` and so on.)
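To make the correspondence concrete, here is a small standalone sketch of how
the C++ side might mirror the values listed above. Only the six enumerators
shown in the ``.td`` excerpt are included, and the ``toyCondCodeSuffix`` helper
is a hypothetical name invented for this example, not a function from
``Sparc.h``.

```cpp
#include <cassert>
#include <cstring>

namespace SPCC {
// A subset of the condition codes, using the values listed in the text.
enum CondCodes {
  ICC_E  = 1,  // Equal
  ICC_NE = 9,  // Not Equal
  ICC_G  = 10, // Greater
  FCC_UG = 21, // Unordered or Greater
  FCC_G  = 22, // Greater
  FCC_U  = 23  // Unordered
};
} // namespace SPCC

// Hypothetical helper: maps a condition code to a branch mnemonic suffix.
const char *toyCondCodeSuffix(SPCC::CondCodes CC) {
  switch (CC) {
  case SPCC::ICC_E:  return "e";
  case SPCC::ICC_NE: return "ne";
  case SPCC::ICC_G:  return "g";
  case SPCC::FCC_UG: return "ug";
  case SPCC::FCC_G:  return "g";
  case SPCC::FCC_U:  return "u";
  }
  return "?"; // value outside the subset modeled here
}
```

If a value is changed in one place but not the other, a helper like this
silently selects the wrong condition, which is why the two definitions need to
be kept in sync.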
				841
				842	Instruction Operand Mapping
				843	---------------------------
				844
				845	The code generator backend maps instruction operands to fields in the
				846	instruction. Operands are assigned to unbound fields in the instruction in the
				847	order they are defined. Fields are bound when they are assigned a value. For
				848	example, the Sparc target defines the ``XNORrr`` instruction as a ``F3_1``
				849	format instruction having three operands.
				850
				851	.. code-block:: llvm
				852
				853	def XNORrr : F3_1<2, 0b000111,
				854	(outs IntRegs:$dst), (ins IntRegs:$b, IntRegs:$c),
				855	"xnor $b, $c, $dst",
				856	[(set IntRegs:$dst, (not (xor IntRegs:$b, IntRegs:$c)))]>;
				857
				858	The instruction templates in ``SparcInstrFormats.td`` show the base class for
				859	``F3_1`` is ``InstSP``.
				860
				861	.. code-block:: llvm
				862
				863	class InstSP<dag outs, dag ins, string asmstr, list<dag> pattern> : Instruction {
				864	field bits<32> Inst;
				865	let Namespace = "SP";
				866	bits<2> op;
				867	let Inst{31-30} = op;
				868	dag OutOperandList = outs;
				869	dag InOperandList = ins;
				870	let AsmString = asmstr;
				871	let Pattern = pattern;
				872	}
				873
				874	``InstSP`` leaves the ``op`` field unbound.
				875
				876	.. code-block:: llvm
				877
				878	class F3<dag outs, dag ins, string asmstr, list<dag> pattern>
				879	: InstSP<outs, ins, asmstr, pattern> {
				880	bits<5> rd;
				881	bits<6> op3;
				882	bits<5> rs1;
				883	let op{1} = 1; // Op = 2 or 3
				884	let Inst{29-25} = rd;
				885	let Inst{24-19} = op3;
				886	let Inst{18-14} = rs1;
				887	}
				888
``F3`` binds the ``op`` field and defines the ``rd``, ``op3``, and ``rs1``
fields. ``F3`` format instructions will bind the operands to the ``rd``,
``op3``, and ``rs1`` fields.
				892
				893	.. code-block:: llvm
				894
				895	class F3_1<bits<2> opVal, bits<6> op3val, dag outs, dag ins,
				896	string asmstr, list<dag> pattern> : F3<outs, ins, asmstr, pattern> {
				897	bits<8> asi = 0; // asi not currently used
				898	bits<5> rs2;
				899	let op = opVal;
				900	let op3 = op3val;
				901	let Inst{13} = 0; // i field = 0
				902	let Inst{12-5} = asi; // address space identifier
				903	let Inst{4-0} = rs2;
				904	}
				905
``F3_1`` binds the ``op3`` field and defines the ``rs2`` field. ``F3_1``
format instructions will bind the operands to the ``rd``, ``rs1``, and ``rs2``
fields. This results in the ``XNORrr`` instruction binding ``$dst``, ``$b``,
and ``$c`` operands to the ``rd``, ``rs1``, and ``rs2`` fields respectively.
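The bit layout that results from these three classes can be checked with a
short standalone sketch. It packs the fields by hand, using the bit ranges
given in ``InstSP``, ``F3``, and ``F3_1``. The function name ``encodeF3_1``,
its parameter order, and the register numbers in the example are assumptions
made for illustration; TableGen generates the real encoder.

```cpp
#include <cassert>
#include <cstdint>

// Packs the fields of a SPARC format 3 register/register instruction into a
// 32-bit word, following the bit ranges in the InstSP, F3, and F3_1 classes.
constexpr std::uint32_t encodeF3_1(std::uint32_t op, std::uint32_t op3,
                                   std::uint32_t rd, std::uint32_t rs1,
                                   std::uint32_t rs2, std::uint32_t asi = 0) {
  return ((op  & 0x3u)  << 30) | // Inst{31-30} = op
         ((rd  & 0x1Fu) << 25) | // Inst{29-25} = rd
         ((op3 & 0x3Fu) << 19) | // Inst{24-19} = op3
         ((rs1 & 0x1Fu) << 14) | // Inst{18-14} = rs1
         (0u << 13)            | // Inst{13}    = 0 (i field)
         ((asi & 0xFFu) << 5)  | // Inst{12-5}  = asi
         (rs2 & 0x1Fu);          // Inst{4-0}   = rs2
}

// XNORrr uses op = 2 and op3 = 0b000111; the register numbers are arbitrary.
static_assert(encodeF3_1(2, 0x07, /*rd=*/1, /*rs1=*/2, /*rs2=*/3) ==
                  0x82388003u,
              "unexpected XNORrr encoding");
```

Here ``$dst`` lands in ``rd``, ``$b`` in ``rs1``, and ``$c`` in ``rs2``,
matching the operand order described above.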
				910
				911	Instruction Relation Mapping
				912	----------------------------
				913
				914	This TableGen feature is used to relate instructions with each other. It is
				915	particularly useful when you have multiple instruction formats and need to
				916	switch between them after instruction selection. This entire feature is driven
				917	by relation models which can be defined in ``XXXInstrInfo.td`` files
				918	according to the target-specific instruction set. Relation models are defined
				919	using ``InstrMapping`` class as a base. TableGen parses all the models
				920	and generates instruction relation maps using the specified information.
				921	Relation maps are emitted as tables in the ``XXXGenInstrInfo.inc`` file
				922	along with the functions to query them. For the detailed information on how to
				923	use this feature, please refer to :doc:`HowToUseInstrMappings`.
				924
				925	Implement a subclass of ``TargetInstrInfo``
				926	-------------------------------------------
				927
				928	The final step is to hand code portions of ``XXXInstrInfo``, which implements
				929	the interface described in ``TargetInstrInfo.h`` (see :ref:`TargetInstrInfo`).
				930	These functions return ``0`` or a Boolean or they assert, unless overridden.
				931	Here's a list of functions that are overridden for the SPARC implementation in
				932	``SparcInstrInfo.cpp``:
				933
				934	* ``isLoadFromStackSlot`` --- If the specified machine instruction is a direct
				935	load from a stack slot, return the register number of the destination and the
				936	``FrameIndex`` of the stack slot.
				937
* ``isStoreToStackSlot`` --- If the specified machine instruction is a direct
  store to a stack slot, return the register number of the source and the
  ``FrameIndex`` of the stack slot.
				941
				942	* ``copyPhysReg`` --- Copy values between a pair of physical registers.
				943
				944	* ``storeRegToStackSlot`` --- Store a register value to a stack slot.
				945
				946	* ``loadRegFromStackSlot`` --- Load a register value from a stack slot.
				947
				948	* ``storeRegToAddr`` --- Store a register value to memory.
				949
				950	* ``loadRegFromAddr`` --- Load a register value from memory.
				951
* ``foldMemoryOperand`` --- Attempt to fold a load or store of the specified
  operand(s) into the given instruction.
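The stack slot queries in the list above share a simple contract: report a
register number and a frame index on a match, and report nothing otherwise. A
minimal standalone sketch of that contract for loads follows. The ``ToyLoad``
type and its fields are hypothetical stand-ins for a real ``MachineInstr``, and
the rule that a "direct" load has a zero offset is an assumption of this
example.

```cpp
#include <cassert>

// Toy machine instruction: a load of the form "ld [base + imm], dst".
struct ToyLoad {
  bool BaseIsFrameIndex; // true when the base operand is an abstract frame index
  int FrameIndex;        // meaningful only when BaseIsFrameIndex is true
  int Offset;            // immediate offset added to the base
  unsigned DestReg;      // destination register number (0 means "none")
};

// Returns the destination register and sets FrameIndex when MI is a direct
// load from a stack slot; otherwise returns 0 and leaves FrameIndex untouched.
unsigned isLoadFromStackSlot(const ToyLoad &MI, int &FrameIndex) {
  if (MI.BaseIsFrameIndex && MI.Offset == 0) {
    FrameIndex = MI.FrameIndex;
    return MI.DestReg;
  }
  return 0;
}
```

A store variant would look the same, except that the returned register is the
value being stored.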
				954
				955	Branch Folding and If Conversion
				956	--------------------------------
				957
				958	Performance can be improved by combining instructions or by eliminating
				959	instructions that are never reached. The ``AnalyzeBranch`` method in
				960	``XXXInstrInfo`` may be implemented to examine conditional instructions and
				961	remove unnecessary instructions. ``AnalyzeBranch`` looks at the end of a
				962	machine basic block (MBB) for opportunities for improvement, such as branch
				963	folding and if conversion. The ``BranchFolder`` and ``IfConverter`` machine
				964	function passes (see the source files ``BranchFolding.cpp`` and
				965	``IfConversion.cpp`` in the ``lib/CodeGen`` directory) call ``AnalyzeBranch``
				966	to improve the control flow graph that represents the instructions.
				967
				968	Several implementations of ``AnalyzeBranch`` (for ARM, Alpha, and X86) can be
				969	examined as models for your own ``AnalyzeBranch`` implementation. Since SPARC
				970	does not implement a useful ``AnalyzeBranch``, the ARM target implementation is
				971	shown below.
				972
				973	``AnalyzeBranch`` returns a Boolean value and takes four parameters:
				974
				975	* ``MachineBasicBlock &MBB`` --- The incoming block to be examined.
				976
				977	* ``MachineBasicBlock *&TBB`` --- A destination block that is returned. For a
				978	conditional branch that evaluates to true, ``TBB`` is the destination.
				979
				980	* ``MachineBasicBlock *&FBB`` --- For a conditional branch that evaluates to
				981	false, ``FBB`` is returned as the destination.
				982
				983	* ``std::vector<MachineOperand> &Cond`` --- List of operands to evaluate a
				984	condition for a conditional branch.
				985
				986	In the simplest case, if a block ends without a branch, then it falls through
				987	to the successor block. No destination blocks are specified for either ``TBB``
				988	or ``FBB``, so both parameters return ``NULL``. The start of the
				989	``AnalyzeBranch`` (see code below for the ARM target) shows the function
				990	parameters and the code for the simplest case.
				991
.. code-block:: c++

  bool ARMInstrInfo::AnalyzeBranch(MachineBasicBlock &MBB,
                                   MachineBasicBlock *&TBB,
                                   MachineBasicBlock *&FBB,
                                   std::vector<MachineOperand> &Cond) const
  {
    MachineBasicBlock::iterator I = MBB.end();
    if (I == MBB.begin() || !isUnpredicatedTerminator(--I))
      return false;
				1002
				1003	If a block ends with a single unconditional branch instruction, then
				1004	``AnalyzeBranch`` (shown below) should return the destination of that branch in
				1005	the ``TBB`` parameter.
				1006
.. code-block:: c++

  if (LastOpc == ARM::B || LastOpc == ARM::tB) {
    TBB = LastInst->getOperand(0).getMBB();
    return false;
  }
				1013
				1014	If a block ends with two unconditional branches, then the second branch is
				1015	never reached. In that situation, as shown below, remove the last branch
				1016	instruction and return the penultimate branch in the ``TBB`` parameter.
				1017
.. code-block:: c++

  if ((SecondLastOpc == ARM::B || SecondLastOpc == ARM::tB) &&
      (LastOpc == ARM::B || LastOpc == ARM::tB)) {
    TBB = SecondLastInst->getOperand(0).getMBB();
    I = LastInst;
    I->eraseFromParent();
    return false;
  }
				1027
A block may end with a single conditional branch instruction that falls through
to the successor block if the condition evaluates to false. In that case,
``AnalyzeBranch`` (shown below) should return the destination of that
conditional branch in the ``TBB`` parameter and a list of operands in the
``Cond`` parameter to evaluate the condition.
				1033
.. code-block:: c++

  if (LastOpc == ARM::Bcc || LastOpc == ARM::tBcc) {
    // Block ends with fall-through condbranch.
    TBB = LastInst->getOperand(0).getMBB();
    Cond.push_back(LastInst->getOperand(1));
    Cond.push_back(LastInst->getOperand(2));
    return false;
  }
				1043
				1044	If a block ends with both a conditional branch and an ensuing unconditional
				1045	branch, then ``AnalyzeBranch`` (shown below) should return the conditional
				1046	branch destination (assuming it corresponds to a conditional evaluation of
				1047	"``true``") in the ``TBB`` parameter and the unconditional branch destination
				1048	in the ``FBB`` (corresponding to a conditional evaluation of "``false``"). A
				1049	list of operands to evaluate the condition should be returned in the ``Cond``
				1050	parameter.
				1051
.. code-block:: c++

  unsigned SecondLastOpc = SecondLastInst->getOpcode();

  if ((SecondLastOpc == ARM::Bcc && LastOpc == ARM::B) ||
      (SecondLastOpc == ARM::tBcc && LastOpc == ARM::tB)) {
    TBB = SecondLastInst->getOperand(0).getMBB();
    Cond.push_back(SecondLastInst->getOperand(1));
    Cond.push_back(SecondLastInst->getOperand(2));
    FBB = LastInst->getOperand(0).getMBB();
    return false;
  }
				1064
				1065	For the last two cases (ending with a single conditional branch or ending with
				1066	one conditional and one unconditional branch), the operands returned in the
				1067	``Cond`` parameter can be passed to methods of other instructions to create new
				1068	branches or perform other operations. An implementation of ``AnalyzeBranch``
				1069	requires the helper methods ``RemoveBranch`` and ``InsertBranch`` to manage
				1070	subsequent operations.
				1071
				1072	``AnalyzeBranch`` should return false indicating success in most circumstances.
				1073	``AnalyzeBranch`` should only return true when the method is stumped about what
				1074	to do, for example, if a block has three terminating branches.
				1075	``AnalyzeBranch`` may return true if it encounters a terminator it cannot
				1076	handle, such as an indirect branch.
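The cases described in this section can be put together in one standalone
sketch. It is a toy model rather than the ARM implementation: ``ToyOpc``,
``ToyInstr``, and ``ToyBlock`` are hypothetical stand-ins for LLVM's machine
instruction and basic block classes, and a single ``int`` stands in for the
condition operands. The return convention (``false`` for success, ``true`` for
"cannot analyze") follows the text.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-ins for MachineInstr and MachineBasicBlock. None of these types
// exist in LLVM; they only model what the text describes.
enum class ToyOpc { Add, Br, BrCond, BrIndirect };

struct ToyBlock;

struct ToyInstr {
  ToyOpc Opc;
  ToyBlock *Target; // branch destination, or nullptr
  int CondCode;     // condition operand, meaningful only for BrCond
};

struct ToyBlock {
  std::vector<ToyInstr> Insts;
};

static bool isTerminator(const ToyInstr &I) { return I.Opc != ToyOpc::Add; }

// Returns false on success and true when the block cannot be analyzed.
bool analyzeBranch(ToyBlock &MBB, ToyBlock *&TBB, ToyBlock *&FBB,
                   std::vector<int> &Cond) {
  TBB = nullptr;
  FBB = nullptr;
  Cond.clear();

  // Count the terminators at the end of the block.
  const std::size_t N = MBB.Insts.size();
  std::size_t NumTerm = 0;
  while (NumTerm < N && isTerminator(MBB.Insts[N - 1 - NumTerm]))
    ++NumTerm;

  if (NumTerm == 0)
    return false; // No branch: the block falls through.
  if (NumTerm > 2)
    return true; // Three or more terminators: give up.

  const ToyInstr Last = MBB.Insts[N - 1];
  if (NumTerm == 1) {
    if (Last.Opc == ToyOpc::Br) { // Single unconditional branch.
      TBB = Last.Target;
      return false;
    }
    if (Last.Opc == ToyOpc::BrCond) { // Conditional branch, then fall through.
      TBB = Last.Target;
      Cond.push_back(Last.CondCode);
      return false;
    }
    return true; // For example, an indirect branch.
  }

  const ToyInstr SecondLast = MBB.Insts[N - 2];
  if (SecondLast.Opc == ToyOpc::Br && Last.Opc == ToyOpc::Br) {
    TBB = SecondLast.Target;
    MBB.Insts.pop_back(); // The second branch is never reached.
    return false;
  }
  if (SecondLast.Opc == ToyOpc::BrCond && Last.Opc == ToyOpc::Br) {
    TBB = SecondLast.Target;
    Cond.push_back(SecondLast.CondCode);
    FBB = Last.Target;
    return false;
  }
  return true; // Any other combination is left unanalyzed.
}
```

Counting the trailing terminators first keeps the "three terminating branches"
bail-out separate from the per-case logic; a real target may structure this
differently.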
				1077
				1078	.. _instruction-selector:
				1079
				1080	Instruction Selector
				1081	====================
				1082
				1083	LLVM uses a ``SelectionDAG`` to represent LLVM IR instructions, and nodes of
				1084	the ``SelectionDAG`` ideally represent native target instructions. During code
				1085	generation, instruction selection passes are performed to convert non-native
				1086	DAG instructions into native target-specific instructions. The pass described
				1087	in ``XXXISelDAGToDAG.cpp`` is used to match patterns and perform DAG-to-DAG
				1088	instruction selection. Optionally, a pass may be defined (in
				1089	``XXXBranchSelector.cpp``) to perform similar DAG-to-DAG operations for branch
				1090	instructions. Later, the code in ``XXXISelLowering.cpp`` replaces or removes
				1091	operations and data types not supported natively (legalizes) in a
				1092	``SelectionDAG``.
				1093
				1094	TableGen generates code for instruction selection using the following target
				1095	description input files:
				1096
				1097	* ``XXXInstrInfo.td`` --- Contains definitions of instructions in a
				1098	target-specific instruction set, generates ``XXXGenDAGISel.inc``, which is
				1099	included in ``XXXISelDAGToDAG.cpp``.
				1100
				1101	* ``XXXCallingConv.td`` --- Contains the calling and return value conventions
				1102	for the target architecture, and it generates ``XXXGenCallingConv.inc``,
				1103	which is included in ``XXXISelLowering.cpp``.
				1104
				1105	The implementation of an instruction selection pass must include a header that
				1106	declares the ``FunctionPass`` class or a subclass of ``FunctionPass``. In
				1107	``XXXTargetMachine.cpp``, a Pass Manager (PM) should add each instruction
				1108	selection pass into the queue of passes to run.
				1109
				1110	The LLVM static compiler (``llc``) is an excellent tool for visualizing the
				1111	contents of DAGs. To display the ``SelectionDAG`` before or after specific
				1112	processing phases, use the command line options for ``llc``, described at
				1113	:ref:`SelectionDAG-Process`.
				1114
				1115	To describe instruction selector behavior, you should add patterns for lowering
				1116	LLVM code into a ``SelectionDAG`` as the last parameter of the instruction
				1117	definitions in ``XXXInstrInfo.td``. For example, in ``SparcInstrInfo.td``,
				1118	this entry defines a register store operation, and the last parameter describes
				1119	a pattern with the store DAG operator.
				1120
				1121	.. code-block:: llvm
				1122
				1123	def STrr : F3_1< 3, 0b000100, (outs), (ins MEMrr:$addr, IntRegs:$src),
				1124	"st $src, [$addr]", [(store IntRegs:$src, ADDRrr:$addr)]>;
				1125
				1126	``ADDRrr`` is a memory mode that is also defined in ``SparcInstrInfo.td``:
				1127
				1128	.. code-block:: llvm
				1129
				1130	def ADDRrr : ComplexPattern<i32, 2, "SelectADDRrr", [], []>;
				1131
The definition of ``ADDRrr`` refers to ``SelectADDRrr``, which is a function
defined in an implementation of the Instruction Selector (such as
``SparcISelDAGToDAG.cpp``).
				1135
				1136	In ``lib/Target/TargetSelectionDAG.td``, the DAG operator for store is defined
				1137	below:
				1138
				1139	.. code-block:: llvm
				1140
				1141	def store : PatFrag<(ops node:$val, node:$ptr),
				1142	(st node:$val, node:$ptr), [{
				1143	if (StoreSDNode *ST = dyn_cast<StoreSDNode>(N))
				1144	return !ST->isTruncatingStore() &&
				1145	ST->getAddressingMode() == ISD::UNINDEXED;
				1146	return false;
				1147	}]>;
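The C++ fragment embedded in this ``PatFrag`` is an ordinary predicate over the
node. The standalone sketch below restates its logic with toy types so that it
can be read in isolation. ``ToyStoreNode`` and ``ToyAddrMode`` are hypothetical
stand-ins for ``StoreSDNode`` and the ``ISD`` addressing modes, and a null
pointer plays the role of a failed ``dyn_cast``.

```cpp
#include <cassert>

// Hypothetical addressing modes; only "Unindexed" matters for this predicate.
enum class ToyAddrMode { Unindexed, PreInc, PostInc };

struct ToyStoreNode {
  bool Truncating;      // true for a truncating store
  ToyAddrMode AddrMode; // how the address operand is updated, if at all
};

// Mirrors the predicate in the store PatFrag: only plain, non-truncating,
// unindexed stores match. A null pointer models "not a store node".
bool predicateStore(const ToyStoreNode *ST) {
  if (ST)
    return !ST->Truncating && ST->AddrMode == ToyAddrMode::Unindexed;
  return false;
}
```

A truncating or indexed store fails this predicate, so a pattern written with
``store`` does not match it.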
				1148
				1149	``XXXInstrInfo.td`` also generates (in ``XXXGenDAGISel.inc``) the
				1150	``SelectCode`` method that is used to call the appropriate processing method
				1151	for an instruction. In this example, ``SelectCode`` calls ``Select_ISD_STORE``
				1152	for the ``ISD::STORE`` opcode.
				1153
				1154	.. code-block:: c++
				1155
				1156	SDNode *SelectCode(SDValue N) {
				1157	...
				1158	MVT::ValueType NVT = N.getNode()->getValueType(0);
				1159	switch (N.getOpcode()) {
				1160	case ISD::STORE: {
				1161	switch (NVT) {
				1162	default:
				1163	return Select_ISD_STORE(N);
				1164	break;
				1165	}
				1166	break;
				1167	}
				1168	...
				1169
				1170	The pattern for ``STrr`` is matched, so elsewhere in ``XXXGenDAGISel.inc``,
				1171	code for ``STrr`` is created for ``Select_ISD_STORE``. The ``Emit_22`` method
				1172	is also generated in ``XXXGenDAGISel.inc`` to complete the processing of this
				1173	instruction.
				1174
				1175	.. code-block:: c++
				1176
				1177	SDNode *Select_ISD_STORE(const SDValue &N) {
				1178	SDValue Chain = N.getOperand(0);
				1179	if (Predicate_store(N.getNode())) {
				1180	SDValue N1 = N.getOperand(1);
				1181	SDValue N2 = N.getOperand(2);
				1182	SDValue CPTmp0;
				1183	SDValue CPTmp1;
				1184
				1185	// Pattern: (st:void IntRegs:i32:$src,
				1186	// ADDRrr:i32:$addr)<<P:Predicate_store>>
				1187	// Emits: (STrr:void ADDRrr:i32:$addr, IntRegs:i32:$src)
				1188	// Pattern complexity = 13 cost = 1 size = 0
				1189	if (SelectADDRrr(N, N2, CPTmp0, CPTmp1) &&
				1190	N1.getNode()->getValueType(0) == MVT::i32 &&
				1191	N2.getNode()->getValueType(0) == MVT::i32) {
				1192	return Emit_22(N, SP::STrr, CPTmp0, CPTmp1);
				1193	}
				1194	...
				1195
				1196	The SelectionDAG Legalize Phase
				1197	-------------------------------
				1198
				1199	The Legalize phase converts a DAG to use types and operations that are natively
				1200	supported by the target. For natively unsupported types and operations, you
				1201	need to add code to the target-specific ``XXXTargetLowering`` implementation to
				1202	convert unsupported types and operations to supported ones.
				1203
In the constructor for the ``XXXTargetLowering`` class, first use the
``addRegisterClass`` method to specify which types are supported and which
register classes are associated with them. The code for the register classes
is generated by TableGen from ``XXXRegisterInfo.td`` and placed in
``XXXGenRegisterInfo.h.inc``. For example, the implementation of the
constructor for the ``SparcTargetLowering`` class (in ``SparcISelLowering.cpp``)
starts with the following code:
				1211
				1212	.. code-block:: c++
				1213
				1214	addRegisterClass(MVT::i32, SP::IntRegsRegisterClass);
				1215	addRegisterClass(MVT::f32, SP::FPRegsRegisterClass);
				1216	addRegisterClass(MVT::f64, SP::DFPRegsRegisterClass);
				1217
				1218	You should examine the node types in the ``ISD`` namespace
				1219	(``include/llvm/CodeGen/SelectionDAGNodes.h``) and determine which operations
				1220	the target natively supports. For operations that do not have native
				1221	support, add a callback to the constructor for the ``XXXTargetLowering`` class,
				1222	so the instruction selection process knows what to do. The ``TargetLowering``
				1223	class callback methods (declared in ``llvm/Target/TargetLowering.h``) are:
				1224
				1225	* ``setOperationAction`` --- General operation.
				1226	* ``setLoadExtAction`` --- Load with extension.
				1227	* ``setTruncStoreAction`` --- Truncating store.
				1228	* ``setIndexedLoadAction`` --- Indexed load.
				1229	* ``setIndexedStoreAction`` --- Indexed store.
				1230	* ``setConvertAction`` --- Type conversion.
				1231	* ``setCondCodeAction`` --- Support for a given condition code.
				1232
				1233	Note: on older releases, ``setLoadXAction`` is used instead of
				1234	``setLoadExtAction``. Also, on older releases, ``setCondCodeAction`` may not
				1235	be supported. Examine your release to see what methods are specifically
				1236	supported.
				1237
These callbacks are used to determine that an operation does or does not work
with a specified type (or types). In all cases, the third parameter is a
``LegalizeAction`` type enum value: ``Promote``, ``Expand``, ``Custom``, or
``Legal``. ``SparcISelLowering.cpp`` contains examples of all four
``LegalizeAction`` values.
				1243
				1244	Promote
				1245	^^^^^^^
				1246
For an operation without native support for a given type, the specified type
may be promoted to a larger type that is supported. For example, SPARC does
not support a sign-extending load for Boolean values (``i1`` type), so in
``SparcISelLowering.cpp`` the third parameter below, ``Promote``, changes
``i1`` type values to a larger type before loading.
				1252
				1253	.. code-block:: c++
				1254
				1255	setLoadExtAction(ISD::SEXTLOAD, MVT::i1, Promote);
				1256
				1257	Expand
				1258	^^^^^^
				1259
				1260	For a type without native support, a value may need to be broken down further,
				1261	rather than promoted. For an operation without native support, a combination
				1262	of other operations may be used to similar effect. In SPARC, the
				1263	floating-point sine and cosine trig operations are supported by expansion to
				1264	other operations, as indicated by the third parameter, ``Expand``, to
				1265	``setOperationAction``:
				1266
				1267	.. code-block:: c++
				1268
				1269	setOperationAction(ISD::FSIN, MVT::f32, Expand);
				1270	setOperationAction(ISD::FCOS, MVT::f32, Expand);
				1271
				1272	Custom
				1273	^^^^^^
				1274
				1275	For some operations, simple type promotion or operation expansion may be
				1276	insufficient. In some cases, a special intrinsic function must be implemented.
				1277
				1278	For example, a constant value may require special treatment, or an operation
				1279	may require spilling and restoring registers in the stack and working with
				1280	register allocators.
				1281
				1282	As seen in ``SparcISelLowering.cpp`` code below, to perform a type conversion
				1283	from a floating point value to a signed integer, first the
				1284	``setOperationAction`` should be called with ``Custom`` as the third parameter:
				1285
				1286	.. code-block:: c++
				1287
				1288	setOperationAction(ISD::FP_TO_SINT, MVT::i32, Custom);
				1289
				1290	In the ``LowerOperation`` method, for each ``Custom`` operation, a case
				1291	statement should be added to indicate what function to call. In the following
				1292	code, an ``FP_TO_SINT`` opcode will call the ``LowerFP_TO_SINT`` method:
				1293
				1294	.. code-block:: c++
				1295
   SDValue SparcTargetLowering::LowerOperation(SDValue Op, SelectionDAG &DAG) {
     switch (Op.getOpcode()) {
     case ISD::FP_TO_SINT: return LowerFP_TO_SINT(Op, DAG);
     ...
     }
   }
				1302
				1303	Finally, the ``LowerFP_TO_SINT`` method is implemented, using an FP register to
				1304	convert the floating-point value to an integer.
				1305
				1306	.. code-block:: c++
				1307
   static SDValue LowerFP_TO_SINT(SDValue Op, SelectionDAG &DAG) {
     assert(Op.getValueType() == MVT::i32);
     Op = DAG.getNode(SPISD::FTOI, MVT::f32, Op.getOperand(0));
     return DAG.getNode(ISD::BITCAST, MVT::i32, Op);
   }
				1313
				1314	Legal
				1315	^^^^^
				1316
				1317	The ``Legal`` ``LegalizeAction`` enum value simply indicates that an operation
				1318	is natively supported. ``Legal`` represents the default condition, so it
				1319	is rarely used. In ``SparcISelLowering.cpp``, the action for ``CTPOP`` (an
				1320	operation to count the bits set in an integer) is natively supported only for
				1321	SPARC v9. The following code enables the ``Expand`` conversion technique for
				1322	non-v9 SPARC implementations.
				1323
				1324	.. code-block:: c++
				1325
				1326	setOperationAction(ISD::CTPOP, MVT::i32, Expand);
				1327	...
				1328	if (TM.getSubtarget<SparcSubtarget>().isV9())
				1329	setOperationAction(ISD::CTPOP, MVT::i32, Legal);
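
The way a default ``Legal`` action combines with per-subtarget overrides can be
illustrated with a small standalone model. The sketch below does not use LLVM's
classes: ``ToyLowering``, ``Op``, ``Ty``, and ``Action`` are hypothetical names
introduced only for illustration, and the table reports ``Legal`` for any pair
that was never registered.

.. code-block:: c++

   #include <cassert>
   #include <map>
   #include <utility>

   // Hypothetical stand-ins for ISD opcodes, MVT types, and LegalizeAction.
   enum class Op { CTPOP, FSIN, FP_TO_SINT, ADD };
   enum class Ty { i32, f32 };
   enum class Action { Legal, Promote, Expand, Custom };

   class ToyLowering {
     std::map<std::pair<Op, Ty>, Action> Actions;

   public:
     explicit ToyLowering(bool IsV9) {
       setOperationAction(Op::FSIN, Ty::f32, Action::Expand);
       setOperationAction(Op::FP_TO_SINT, Ty::i32, Action::Custom);
       // Expand CTPOP by default, then mark it Legal again for V9.
       setOperationAction(Op::CTPOP, Ty::i32, Action::Expand);
       if (IsV9)
         setOperationAction(Op::CTPOP, Ty::i32, Action::Legal);
     }

     void setOperationAction(Op O, Ty T, Action A) {
       Actions[std::make_pair(O, T)] = A;
     }

     // Pairs that were never registered fall back to Legal.
     Action getOperationAction(Op O, Ty T) const {
       auto It = Actions.find(std::make_pair(O, T));
       return It == Actions.end() ? Action::Legal : It->second;
     }
   };

In this model, ``ToyLowering(false)`` reports ``Expand`` for ``CTPOP`` on
``i32``, while ``ToyLowering(true)`` reports ``Legal``. An operation such as
``ADD`` that is never mentioned is ``Legal`` in both.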
				1330
				1331	Calling Conventions
				1332	-------------------
				1333
To support target-specific calling conventions, ``XXXCallingConv.td`` uses
interfaces (such as ``CCIfType`` and ``CCAssignToReg``) that are defined in
``lib/Target/TargetCallingConv.td``. TableGen can take the target descriptor
file ``XXXCallingConv.td`` and generate the header file
``XXXGenCallingConv.inc``, which is typically included in
``XXXISelLowering.cpp``. You can use the interfaces in
``TargetCallingConv.td`` to specify:
				1341
				1342	* The order of parameter allocation.
				1343
				1344	* Where parameters and return values are placed (that is, on the stack or in
				1345	registers).
				1346
				1347	* Which registers may be used.
				1348
				1349	* Whether the caller or callee unwinds the stack.
				1350
				1351	The following example demonstrates the use of the ``CCIfType`` and
				1352	``CCAssignToReg`` interfaces. If the ``CCIfType`` predicate is true (that is,
				1353	if the current argument is of type ``f32`` or ``f64``), then the action is
				1354	performed. In this case, the ``CCAssignToReg`` action assigns the argument
				1355	value to the first available register: either ``R0`` or ``R1``.
				1356
				1357	.. code-block:: llvm
				1358
				1359	CCIfType<[f32,f64], CCAssignToReg<[R0, R1]>>
				1360
				1361	``SparcCallingConv.td`` contains definitions for a target-specific return-value
				1362	calling convention (``RetCC_Sparc32``) and a basic 32-bit C calling convention
				1363	(``CC_Sparc32``). The definition of ``RetCC_Sparc32`` (shown below) indicates
which registers are used for specified scalar return types. A single-precision
float is returned in register ``F0``, and a double-precision float is returned
in register ``D0``. A 32-bit integer is returned in register ``I0`` or ``I1``.
				1367
				1368	.. code-block:: llvm
				1369
				1370	def RetCC_Sparc32 : CallingConv<[
				1371	CCIfType<[i32], CCAssignToReg<[I0, I1]>>,
				1372	CCIfType<[f32], CCAssignToReg<[F0]>>,
				1373	CCIfType<[f64], CCAssignToReg<[D0]>>
				1374	]>;
				1375
				1376	The definition of ``CC_Sparc32`` in ``SparcCallingConv.td`` introduces
				1377	``CCAssignToStack``, which assigns the value to a stack slot with the specified
				1378	size and alignment. In the example below, the first parameter, 4, indicates
				1379	the size of the slot, and the second parameter, also 4, indicates the stack
				1380	alignment along 4-byte units. (Special cases: if size is zero, then the ABI
				1381	size is used; if alignment is zero, then the ABI alignment is used.)
				1382
				1383	.. code-block:: llvm
				1384
				1385	def CC_Sparc32 : CallingConv<[
				1386	// All arguments get passed in integer registers if there is space.
				1387	CCIfType<[i32, f32, f64], CCAssignToReg<[I0, I1, I2, I3, I4, I5]>>,
				1388	CCAssignToStack<4, 4>
				1389	]>;
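
The effect of a register rule followed by a ``CCAssignToStack<4, 4>`` fallback
can be sketched as a standalone C++ model. This is not the code TableGen
generates: ``ArgLoc`` and ``assignArgs`` are hypothetical names, and the model
assumes every argument occupies exactly one register or one 4-byte slot.

.. code-block:: c++

   #include <cassert>
   #include <cstddef>
   #include <string>
   #include <vector>

   // Hypothetical record of where one argument ends up.
   struct ArgLoc {
     bool InReg;           // true: passed in Reg; false: passed at StackOffset
     std::string Reg;      // register name when InReg is true
     unsigned StackOffset; // byte offset when InReg is false
   };

   // Mimics a rule list of CCAssignToReg<[I0..I5]> followed by
   // CCAssignToStack<4, 4>: take the next free register, else a stack slot.
   std::vector<ArgLoc> assignArgs(std::size_t NumArgs) {
     static const char *const Regs[] = {"I0", "I1", "I2", "I3", "I4", "I5"};
     const std::size_t NumRegs = sizeof(Regs) / sizeof(Regs[0]);
     const unsigned SlotSize = 4, SlotAlign = 4;

     std::vector<ArgLoc> Locs;
     unsigned NextOffset = 0;
     for (std::size_t I = 0; I < NumArgs; ++I) {
       if (I < NumRegs) {
         Locs.push_back({true, Regs[I], 0});
         continue;
       }
       // Round the offset up to the slot alignment, then reserve the slot.
       NextOffset = (NextOffset + SlotAlign - 1) / SlotAlign * SlotAlign;
       Locs.push_back({false, "", NextOffset});
       NextOffset += SlotSize;
     }
     return Locs;
   }

With eight arguments, the first six land in ``I0`` through ``I5`` and the last
two are placed at stack offsets 0 and 4.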
				1390
``CCDelegateTo`` is another commonly used interface, which hands the current
value to a specified sub-calling convention. In the following example (in
``X86CallingConv.td``), the definition of ``RetCC_X86_32_C`` ends with
``CCDelegateTo``. The rules are tried in order, so a value that the first two
rules do not assign to register ``ST0`` or ``ST1`` (for example, an integer) is
passed on to ``RetCC_X86Common``.
				1397
				1398	.. code-block:: llvm
				1399
				1400	def RetCC_X86_32_C : CallingConv<[
				1401	CCIfType<[f32], CCAssignToReg<[ST0, ST1]>>,
				1402	CCIfType<[f64], CCAssignToReg<[ST0, ST1]>>,
				1403	CCDelegateTo<RetCC_X86Common>
				1404	]>;
				1405
				1406	``CCIfCC`` is an interface that attempts to match the given name to the current
				1407	calling convention. If the name identifies the current calling convention,
				1408	then a specified action is invoked. In the following example (in
				1409	``X86CallingConv.td``), if the ``Fast`` calling convention is in use, then
				1410	``RetCC_X86_32_Fast`` is invoked. If the ``SSECall`` calling convention is in
				1411	use, then ``RetCC_X86_32_SSE`` is invoked.
				1412
				1413	.. code-block:: llvm
				1414
				1415	def RetCC_X86_32 : CallingConv<[
				1416	CCIfCC<"CallingConv::Fast", CCDelegateTo<RetCC_X86_32_Fast>>,
				1417	CCIfCC<"CallingConv::X86_SSECall", CCDelegateTo<RetCC_X86_32_SSE>>,
				1418	CCDelegateTo<RetCC_X86_32_C>
				1419	]>;
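
The first-match-wins behavior of such a rule list can be modeled in plain C++.
The sketch below is only an analogy for how ``CCIfCC`` and ``CCDelegateTo``
compose; ``Rule``, ``runRules``, ``ifCC``, ``delegateTo``, ``assign``, and
``pickRetCC`` are hypothetical helpers, and calling conventions are represented
by plain strings rather than ``CallingConv`` enum values.

.. code-block:: c++

   #include <cassert>
   #include <functional>
   #include <string>
   #include <vector>

   // A rule returns true once it has handled the value, which ends the search.
   using Rule = std::function<bool(const std::string &CC, std::string &Out)>;

   bool runRules(const std::vector<Rule> &Rules, const std::string &CC,
                 std::string &Out) {
     for (const Rule &R : Rules)
       if (R(CC, Out))
         return true;
     return false;
   }

   // Analogue of CCDelegateTo: hand the value to another rule list.
   Rule delegateTo(std::vector<Rule> Sub) {
     return [Sub](const std::string &CC, std::string &Out) {
       return runRules(Sub, CC, Out);
     };
   }

   // Analogue of CCIfCC: apply Action only if the active convention matches.
   Rule ifCC(std::string Name, Rule Action) {
     return [Name, Action](const std::string &CC, std::string &Out) {
       return CC == Name && Action(CC, Out);
     };
   }

   // Terminal action that records which convention handled the value.
   Rule assign(std::string Label) {
     return [Label](const std::string &, std::string &Out) {
       Out = Label;
       return true;
     };
   }

   std::string pickRetCC(const std::string &CC) {
     const std::vector<Rule> RetCC = {
         ifCC("Fast", delegateTo({assign("RetCC_X86_32_Fast")})),
         ifCC("X86_SSECall", delegateTo({assign("RetCC_X86_32_SSE")})),
         delegateTo({assign("RetCC_X86_32_C")})};
     std::string Out;
     runRules(RetCC, CC, Out);
     return Out;
   }

Here ``pickRetCC("Fast")`` selects ``RetCC_X86_32_Fast``, and any convention
name that matches neither ``CCIfCC`` analogue falls through to
``RetCC_X86_32_C``.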
				1420
				1421	Other calling convention interfaces include:
				1422
				1423	* ``CCIf <predicate, action>`` --- If the predicate matches, apply the action.
				1424
				1425	* ``CCIfInReg <action>`` --- If the argument is marked with the "``inreg``"
				1426	attribute, then apply the action.
				1427
				1428	* ``CCIfNest <action>`` --- If the argument is marked with the "``nest``"
				1429	attribute, then apply the action.
				1430
				1431	* ``CCIfNotVarArg <action>`` --- If the current function does not take a
				1432	variable number of arguments, apply the action.
				1433
				1434	* ``CCAssignToRegWithShadow <registerList, shadowList>`` --- similar to
				1435	``CCAssignToReg``, but with a shadow list of registers.
				1436
				1437	* ``CCPassByVal <size, align>`` --- Assign value to a stack slot with the
				1438	minimum specified size and alignment.
				1439
				1440	* ``CCPromoteToType <type>`` --- Promote the current value to the specified
				1441	type.
				1442
				1443	* ``CallingConv <[actions]>`` --- Define each calling convention that is
				1444	supported.
				1445
				1446	Assembly Printer
				1447	================
				1448
During the code emission stage, the code generator may utilize an LLVM pass to
produce assembly output. To do this, implement a printer that converts LLVM IR
to a GAS-format assembly language for your target machine, using the following
steps:
				1453
				1454	* Define all the assembly strings for your target, adding them to the
				1455	instructions defined in the ``XXXInstrInfo.td`` file. (See
				1456	:ref:`instruction-set`.) TableGen will produce an output file
				1457	(``XXXGenAsmWriter.inc``) with an implementation of the ``printInstruction``
				1458	method for the ``XXXAsmPrinter`` class.
				1459
				1460	* Write ``XXXTargetAsmInfo.h``, which contains the bare-bones declaration of
				1461	the ``XXXTargetAsmInfo`` class (a subclass of ``TargetAsmInfo``).
				1462
				1463	* Write ``XXXTargetAsmInfo.cpp``, which contains target-specific values for
				1464	``TargetAsmInfo`` properties and sometimes new implementations for methods.
				1465
				1466	* Write ``XXXAsmPrinter.cpp``, which implements the ``AsmPrinter`` class that
				1467	performs the LLVM-to-assembly conversion.
				1468
				1469	The code in ``XXXTargetAsmInfo.h`` is usually a trivial declaration of the
				1470	``XXXTargetAsmInfo`` class for use in ``XXXTargetAsmInfo.cpp``. Similarly,
				1471	``XXXTargetAsmInfo.cpp`` usually has a few declarations of ``XXXTargetAsmInfo``
				1472	replacement values that override the default values in ``TargetAsmInfo.cpp``.
				1473	For example in ``SparcTargetAsmInfo.cpp``:
				1474
				1475	.. code-block:: c++
				1476
				1477	SparcTargetAsmInfo::SparcTargetAsmInfo(const SparcTargetMachine &TM) {
				1478	Data16bitsDirective = "\t.half\t";
				1479	Data32bitsDirective = "\t.word\t";
				1480	Data64bitsDirective = 0; // .xword is only supported by V9.
				1481	ZeroDirective = "\t.skip\t";
				1482	CommentString = "!";
				1483	ConstantPoolSection = "\t.section \".rodata\",#alloc\n";
				1484	}
				1485
The X86 assembly printer implementation (``X86TargetAsmInfo``) is an example
where the target-specific ``TargetAsmInfo`` class uses an overridden method:
``ExpandInlineAsm``.
				1489
A target-specific implementation of ``AsmPrinter`` is written in
``XXXAsmPrinter.cpp``; it converts LLVM code to printable assembly. The
implementation must include the following headers, which declare the
``AsmPrinter`` and ``MachineFunctionPass`` classes. ``MachineFunctionPass`` is
a subclass of ``FunctionPass``.
				1496
				1497	.. code-block:: c++
				1498
				1499	#include "llvm/CodeGen/AsmPrinter.h"
				1500	#include "llvm/CodeGen/MachineFunctionPass.h"
				1501
				1502	As a ``FunctionPass``, ``AsmPrinter`` first calls ``doInitialization`` to set
				1503	up the ``AsmPrinter``. In ``SparcAsmPrinter``, a ``Mangler`` object is
				1504	instantiated to process variable names.
				1505
				1506	In ``XXXAsmPrinter.cpp``, the ``runOnMachineFunction`` method (declared in
				1507	``MachineFunctionPass``) must be implemented for ``XXXAsmPrinter``. In
				1508	``MachineFunctionPass``, the ``runOnFunction`` method invokes
				1509	``runOnMachineFunction``. Target-specific implementations of
				1510	``runOnMachineFunction`` differ, but generally do the following to process each
				1511	machine function:
				1512
				1513	* Call ``SetupMachineFunction`` to perform initialization.
				1514
				1515	* Call ``EmitConstantPool`` to print out (to the output stream) constants which
				1516	have been spilled to memory.
				1517
				1518	* Call ``EmitJumpTableInfo`` to print out jump tables used by the current
				1519	function.
				1520
				1521	* Print out the label for the current function.
				1522
* Print out the code for the function, including basic block labels and the
  assembly for each instruction (using ``printInstruction``).
				1525
				1526	The ``XXXAsmPrinter`` implementation must also include the code generated by
				1527	TableGen that is output in the ``XXXGenAsmWriter.inc`` file. The code in
				1528	``XXXGenAsmWriter.inc`` contains an implementation of the ``printInstruction``
				1529	method that may call these methods:
				1530
				1531	* ``printOperand``
				1532	* ``printMemOperand``
				1533	* ``printCCOperand`` (for conditional statements)
				1534	* ``printDataDirective``
				1535	* ``printDeclare``
				1536	* ``printImplicitDef``
				1537	* ``printInlineAsm``
				1538
				1539	The implementations of ``printDeclare``, ``printImplicitDef``,
				1540	``printInlineAsm``, and ``printLabel`` in ``AsmPrinter.cpp`` are generally
				1541	adequate for printing assembly and do not need to be overridden.
				1542
				1543	The ``printOperand`` method is implemented with a long ``switch``/``case``
				1544	statement for the type of operand: register, immediate, basic block, external
				1545	symbol, global address, constant pool index, or jump table index. For an
				1546	instruction with a memory address operand, the ``printMemOperand`` method
				1547	should be implemented to generate the proper output. Similarly,
				1548	``printCCOperand`` should be used to print a conditional operand.
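
A ``switch`` over operand kinds of the sort described above might look like the
following standalone sketch. ``ToyOperand`` is a hypothetical, much simplified
stand-in for LLVM's ``MachineOperand``, only four operand kinds are modeled,
and the ``.LCPI`` label prefix is chosen purely for illustration.

.. code-block:: c++

   #include <cassert>
   #include <ostream>
   #include <sstream>
   #include <string>

   // Hypothetical, simplified operand; LLVM's MachineOperand is far richer.
   struct ToyOperand {
     enum Kind { Register, Immediate, GlobalAddress, ConstantPoolIndex } K;
     std::string Name; // register or symbol name
     long Value;       // immediate value or constant pool index
   };

   void printOperand(const ToyOperand &MO, std::ostream &OS) {
     switch (MO.K) {
     case ToyOperand::Register:
       OS << '%' << MO.Name;
       break;
     case ToyOperand::Immediate:
       OS << MO.Value;
       break;
     case ToyOperand::GlobalAddress:
       OS << MO.Name;
       break;
     case ToyOperand::ConstantPoolIndex:
       OS << ".LCPI" << MO.Value;
       break;
     }
   }

   // Prints a memory operand as [base+offset] using two printOperand calls.
   void printMemOperand(const ToyOperand &Base, const ToyOperand &Offset,
                        std::ostream &OS) {
     OS << '[';
     printOperand(Base, OS);
     OS << '+';
     printOperand(Offset, OS);
     OS << ']';
   }

Printing a register base ``fp`` with an immediate offset of 8 through
``printMemOperand`` produces ``[%fp+8]``.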
				1549
				1550	``doFinalization`` should be overridden in ``XXXAsmPrinter``, and it should be
				1551	called to shut down the assembly printer. During ``doFinalization``, global
				1552	variables and constants are printed to output.
				1553
				1554	Subtarget Support
				1555	=================
				1556
				1557	Subtarget support is used to inform the code generation process of instruction
				1558	set variations for a given chip set. For example, the LLVM SPARC
				1559	implementation provided covers three major versions of the SPARC microprocessor
				1560	architecture: Version 8 (V8, which is a 32-bit architecture), Version 9 (V9, a
				1561	64-bit architecture), and the UltraSPARC architecture. V8 has 16
				1562	double-precision floating-point registers that are also usable as either 32
				1563	single-precision or 8 quad-precision registers. V8 is also purely big-endian.
				1564	V9 has 32 double-precision floating-point registers that are also usable as 16
				1565	quad-precision registers, but cannot be used as single-precision registers.
				1566	The UltraSPARC architecture combines V9 with UltraSPARC Visual Instruction Set
				1567	extensions.
				1568
				1569	If subtarget support is needed, you should implement a target-specific
				1570	``XXXSubtarget`` class for your architecture. This class should process the
				1571	command-line options ``-mcpu=`` and ``-mattr=``.
				1572
				1573	TableGen uses definitions in the ``Target.td`` and ``Sparc.td`` files to
				1574	generate code in ``SparcGenSubtarget.inc``. In ``Target.td``, shown below, the
				1575	``SubtargetFeature`` interface is defined. The first 4 string parameters of
				1576	the ``SubtargetFeature`` interface are a feature name, an attribute set by the
				1577	feature, the value of the attribute, and a description of the feature. (The
				1578	fifth parameter is a list of features whose presence is implied, and its
				1579	default value is an empty array.)
				1580
				1581	.. code-block:: llvm
				1582
				1583	class SubtargetFeature<string n, string a, string v, string d,
				1584	list<SubtargetFeature> i = []> {
				1585	string Name = n;
				1586	string Attribute = a;
				1587	string Value = v;
				1588	string Desc = d;
				1589	list<SubtargetFeature> Implies = i;
				1590	}
				1591
				1592	In the ``Sparc.td`` file, the ``SubtargetFeature`` is used to define the
				1593	following features.
				1594
				1595	.. code-block:: llvm
				1596
				1597	def FeatureV9 : SubtargetFeature<"v9", "IsV9", "true",
				1598	"Enable SPARC-V9 instructions">;
				1599	def FeatureV8Deprecated : SubtargetFeature<"deprecated-v8",
				1600	"V8DeprecatedInsts", "true",
				1601	"Enable deprecated V8 instructions in V9 mode">;
				1602	def FeatureVIS : SubtargetFeature<"vis", "IsVIS", "true",
				1603	"Enable UltraSPARC Visual Instruction Set extensions">;
				1604
				1605	Elsewhere in ``Sparc.td``, the ``Proc`` class is defined and then is used to
				1606	define particular SPARC processor subtypes that may have the previously
				1607	described features.
				1608
				1609	.. code-block:: llvm
				1610
				1611	class Proc<string Name, list<SubtargetFeature> Features>
				1612	: Processor<Name, NoItineraries, Features>;
				1613
				1614	def : Proc<"generic", []>;
				1615	def : Proc<"v8", []>;
				1616	def : Proc<"supersparc", []>;
				1617	def : Proc<"sparclite", []>;
				1618	def : Proc<"f934", []>;
				1619	def : Proc<"hypersparc", []>;
				1620	def : Proc<"sparclite86x", []>;
				1621	def : Proc<"sparclet", []>;
				1622	def : Proc<"tsc701", []>;
				1623	def : Proc<"v9", [FeatureV9]>;
				1624	def : Proc<"ultrasparc", [FeatureV9, FeatureV8Deprecated]>;
				1625	def : Proc<"ultrasparc3", [FeatureV9, FeatureV8Deprecated]>;
				1626	def : Proc<"ultrasparc3-vis", [FeatureV9, FeatureV8Deprecated, FeatureVIS]>;
				1627
From the ``Target.td`` and ``Sparc.td`` files, TableGen produces
``SparcGenSubtarget.inc``, which specifies enum values to identify the
features, arrays of constants to represent the CPU features and CPU subtypes,
and the ``ParseSubtargetFeatures`` method, which parses the features string and
sets the specified subtarget options. The generated ``SparcGenSubtarget.inc``
file should be included in ``SparcSubtarget.cpp``. The target-specific
implementation of the ``XXXSubtarget`` constructor should follow this
pseudocode:
				1635
				1636	.. code-block:: c++
				1637
				1638	XXXSubtarget::XXXSubtarget(const Module &M, const std::string &FS) {
				1639	// Set the default features
				1640	// Determine default and user specified characteristics of the CPU
				1641	// Call ParseSubtargetFeatures(FS, CPU) to parse the features string
				1642	// Perform any additional operations
				1643	}
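
To make the pseudocode more concrete, the following standalone sketch sets CPU
defaults and then applies a features string of the form ``+name,-name``. It is
not the code that TableGen generates: ``ToySubtarget`` and ``parseFeatures``
are hypothetical names, and the constructor takes a CPU name directly instead
of a ``Module``.

.. code-block:: c++

   #include <cassert>
   #include <sstream>
   #include <string>

   // Hypothetical subtarget holding the three attributes named in Sparc.td.
   struct ToySubtarget {
     bool IsV9 = false;
     bool V8DeprecatedInsts = false;
     bool IsVIS = false;

     // Parses a comma-separated list such as "+v9,-vis". Unknown names and
     // entries without a leading '+' or '-' are ignored in this sketch.
     void parseFeatures(const std::string &FS) {
       std::istringstream In(FS);
       std::string Item;
       while (std::getline(In, Item, ',')) {
         if (Item.size() < 2 || (Item[0] != '+' && Item[0] != '-'))
           continue;
         const bool Enable = Item[0] == '+';
         const std::string Name = Item.substr(1);
         if (Name == "v9")
           IsV9 = Enable;
         else if (Name == "deprecated-v8")
           V8DeprecatedInsts = Enable;
         else if (Name == "vis")
           IsVIS = Enable;
       }
     }

     ToySubtarget(const std::string &CPU, const std::string &FS) {
       // Set the CPU defaults first, then let the features string override.
       if (CPU == "v9")
         IsV9 = true;
       else if (CPU == "ultrasparc" || CPU == "ultrasparc3")
         IsV9 = V8DeprecatedInsts = true;
       else if (CPU == "ultrasparc3-vis")
         IsV9 = V8DeprecatedInsts = IsVIS = true;
       parseFeatures(FS);
     }
   };

For example, ``ToySubtarget("ultrasparc3-vis", "-vis")`` keeps the V9 and
deprecated-V8 attributes from the CPU defaults but clears ``IsVIS``.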
				1644
				1645	JIT Support
				1646	===========
				1647
				1648	The implementation of a target machine optionally includes a Just-In-Time (JIT)
				1649	code generator that emits machine code and auxiliary structures as binary
				1650	output that can be written directly to memory. To do this, implement JIT code
				1651	generation by performing the following steps:
				1652
				1653	* Write an ``XXXCodeEmitter.cpp`` file that contains a machine function pass
				1654	that transforms target-machine instructions into relocatable machine
				1655	code.
				1656
				1657	* Write an ``XXXJITInfo.cpp`` file that implements the JIT interfaces for
				1658	target-specific code-generation activities, such as emitting machine code and
				1659	stubs.
				1660
				1661	* Modify ``XXXTargetMachine`` so that it provides a ``TargetJITInfo`` object
				1662	through its ``getJITInfo`` method.
				1663
				1664	There are several different approaches to writing the JIT support code. For
				1665	instance, TableGen and target descriptor files may be used for creating a JIT
				1666	code generator, but are not mandatory. For the Alpha and PowerPC target
				1667	machines, TableGen is used to generate ``XXXGenCodeEmitter.inc``, which
				1668	contains the binary coding of machine instructions and the
				1669	``getBinaryCodeForInstr`` method to access those codes. Other JIT
				1670	implementations do not.
				1671
				1672	Both ``XXXJITInfo.cpp`` and ``XXXCodeEmitter.cpp`` must include the
				1673	``llvm/CodeGen/MachineCodeEmitter.h`` header file that defines the
				1674	``MachineCodeEmitter`` class containing code for several callback functions
				1675	that write data (in bytes, words, strings, etc.) to the output stream.
				1676
				1677	Machine Code Emitter
				1678	--------------------
				1679
In ``XXXCodeEmitter.cpp``, a target-specific version of the ``Emitter`` class
is implemented as a function pass (subclass of ``MachineFunctionPass``). The
target-specific implementation of ``runOnMachineFunction`` (invoked by
``runOnFunction`` in ``MachineFunctionPass``) iterates through each
``MachineBasicBlock`` and calls ``emitInstruction`` to process each instruction
and emit binary code. ``emitInstruction`` is largely implemented with case
statements on the instruction types defined in ``XXXInstrInfo.h``. For
example, in ``X86CodeEmitter.cpp``, the ``emitInstruction`` method is built
around the following ``switch``/``case`` statements:
				1689
				1690	.. code-block:: c++
				1691
   switch (Desc->TSFlags & X86::FormMask) {
   case X86II::Pseudo:  // for not yet implemented instructions
      ...               // or pseudo-instructions
      break;
   case X86II::RawFrm:  // for instructions with a fixed opcode value
      ...
      break;
   case X86II::AddRegFrm: // for instructions that have one register operand
      ...                 // added to their opcode
      break;
   case X86II::MRMDestReg:// for instructions that use the Mod/RM byte
      ...                 // to specify a destination (register)
      break;
   case X86II::MRMDestMem:// for instructions that use the Mod/RM byte
      ...                 // to specify a destination (memory)
      break;
   case X86II::MRMSrcReg: // for instructions that use the Mod/RM byte
      ...                 // to specify a source (register)
      break;
   case X86II::MRMSrcMem: // for instructions that use the Mod/RM byte
      ...                 // to specify a source (memory)
      break;
   case X86II::MRM0r: case X86II::MRM1r:  // for instructions that operate on
   case X86II::MRM2r: case X86II::MRM3r:  // a REGISTER r/m operand and
   case X86II::MRM4r: case X86II::MRM5r:  // use the Mod/RM byte and a field
   case X86II::MRM6r: case X86II::MRM7r:  // to hold extended opcode data
      ...
      break;
   case X86II::MRM0m: case X86II::MRM1m:  // for instructions that operate on
   case X86II::MRM2m: case X86II::MRM3m:  // a MEMORY r/m operand and
   case X86II::MRM4m: case X86II::MRM5m:  // use the Mod/RM byte and a field
   case X86II::MRM6m: case X86II::MRM7m:  // to hold extended opcode data
      ...
      break;
   case X86II::MRMInitReg: // for instructions whose source and
      ...                  // destination are the same register
      break;
   }
				1730
				1731	The implementations of these case statements often first emit the opcode and
				1732	then get the operand(s). Then depending upon the operand, helper methods may
				1733	be called to process the operand(s). For example, in ``X86CodeEmitter.cpp``,
				1734	for the ``X86II::AddRegFrm`` case, the first data emitted (by ``emitByte``) is
				1735	the opcode added to the register operand. Then an object representing the
				1736	machine operand, ``MO1``, is extracted. The helper methods such as
				1737	``isImmediate``, ``isGlobalAddress``, ``isExternalSymbol``,
				1738	``isConstantPoolIndex``, and ``isJumpTableIndex`` determine the operand type.
				1739	(``X86CodeEmitter.cpp`` also has private methods such as ``emitConstant``,
				1740	``emitGlobalAddress``, ``emitExternalSymbolAddress``, ``emitConstPoolAddress``,
				1741	and ``emitJumpTableAddress`` that emit the data into the output stream.)
				1742
				1743	.. code-block:: c++
				1744
   case X86II::AddRegFrm:
     MCE.emitByte(BaseOpcode + getX86RegNum(MI.getOperand(CurOp++).getReg()));

     if (CurOp != NumOps) {
       const MachineOperand &MO1 = MI.getOperand(CurOp++);
       unsigned Size = X86InstrInfo::sizeOfImm(Desc);
       if (MO1.isImmediate())
         emitConstant(MO1.getImm(), Size);
       else {
         unsigned rt = Is64BitMode ? X86::reloc_pcrel_word
           : (IsPIC ? X86::reloc_picrel_word : X86::reloc_absolute_word);
         if (Opcode == X86::MOV64ri)
           rt = X86::reloc_absolute_dword;  // FIXME: add X86II flag?
         if (MO1.isGlobalAddress()) {
           bool NeedStub = isa<Function>(MO1.getGlobal());
           bool isLazy = gvNeedsLazyPtr(MO1.getGlobal());
           emitGlobalAddress(MO1.getGlobal(), rt, MO1.getOffset(), 0,
                             NeedStub, isLazy);
         } else if (MO1.isExternalSymbol())
           emitExternalSymbolAddress(MO1.getSymbolName(), rt);
         else if (MO1.isConstantPoolIndex())
           emitConstPoolAddress(MO1.getIndex(), rt);
         else if (MO1.isJumpTableIndex())
           emitJumpTableAddress(MO1.getIndex(), rt);
       }
     }
     break;
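
The immediate-operand path of this case can be reproduced in a few lines of
standalone C++. The sketch below uses a hypothetical ``ToyEmitter`` in place of
``MachineCodeEmitter`` and handles only a 32-bit immediate; relocations and the
other operand kinds are left out.

.. code-block:: c++

   #include <cassert>
   #include <cstdint>
   #include <vector>

   // Hypothetical byte sink standing in for MachineCodeEmitter.
   struct ToyEmitter {
     std::vector<std::uint8_t> Bytes;

     void emitByte(std::uint8_t B) { Bytes.push_back(B); }

     // Emits the low Size bytes of Val in little-endian order.
     void emitConstant(std::uint64_t Val, unsigned Size) {
       for (unsigned I = 0; I < Size; ++I) {
         emitByte(static_cast<std::uint8_t>(Val & 0xFF));
         Val >>= 8;
       }
     }

     // AddRegFrm: the register number is added to the base opcode, and the
     // 32-bit immediate follows.
     void emitAddRegFrm(std::uint8_t BaseOpcode, unsigned RegNum,
                        std::uint32_t Imm) {
       assert(RegNum < 8 && "register number must fit in three bits");
       emitByte(static_cast<std::uint8_t>(BaseOpcode + RegNum));
       emitConstant(Imm, 4);
     }
   };

Calling ``emitAddRegFrm(0xB8, 1, 0x12345678)`` yields the five bytes
``B9 78 56 34 12``, which is the X86 encoding of ``mov ecx, 0x12345678``.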
				1772
				1773	In the previous example, ``XXXCodeEmitter.cpp`` uses the variable ``rt``, which
				1774	is a ``RelocationType`` enum that may be used to relocate addresses (for
				1775	example, a global address with a PIC base offset). The ``RelocationType`` enum
				1776	for that target is defined in the short target-specific ``XXXRelocations.h``
				1777	file. The ``RelocationType`` is used by the ``relocate`` method defined in
				1778	``XXXJITInfo.cpp`` to rewrite addresses for referenced global symbols.
				1779
				1780	For example, ``X86Relocations.h`` specifies the following relocation types for
				1781	the X86 addresses. In all four cases, the relocated value is added to the
				1782	value already in memory. For ``reloc_pcrel_word`` and ``reloc_picrel_word``,
				1783	there is an additional initial adjustment.
				1784
				1785	.. code-block:: c++
				1786
				1787	enum RelocationType {
				1788	reloc_pcrel_word = 0, // add reloc value after adjusting for the PC loc
				1789	reloc_picrel_word = 1, // add reloc value after adjusting for the PIC base
				1790	reloc_absolute_word = 2, // absolute relocation; no additional adjustment
				1791	reloc_absolute_dword = 3 // absolute relocation; no additional adjustment
				1792	};
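
A simplified picture of how a ``relocate`` method might apply these four types
is sketched below. It is not the code in ``X86JITInfo.cpp``:
``ToyRelocationType`` and this ``relocate`` signature are hypothetical,
addresses are plain integers, and the PC-relative case assumes that the
program counter points just past the 4-byte field being patched.

.. code-block:: c++

   #include <cassert>
   #include <cstdint>
   #include <cstring>

   enum ToyRelocationType {
     pcrel_word,
     picrel_word,
     absolute_word,
     absolute_dword
   };

   // Patches the field at Buf + Offset. BufAddr is the address at which Buf
   // will execute, Target is the resolved symbol address, and PICBase is the
   // PIC base address. The new value is added to what is already stored.
   void relocate(unsigned char *Buf, std::uint64_t BufAddr,
                 std::uint64_t Offset, ToyRelocationType Type,
                 std::uint64_t Target, std::uint64_t PICBase) {
     unsigned char *Pos = Buf + Offset;
     if (Type == absolute_dword) {
       std::uint64_t V;
       std::memcpy(&V, Pos, sizeof V);
       V += Target;
       std::memcpy(Pos, &V, sizeof V);
       return;
     }
     std::uint64_t Delta = Target;
     if (Type == pcrel_word)
       Delta -= BufAddr + Offset + 4; // adjust for the PC location
     else if (Type == picrel_word)
       Delta -= PICBase;              // adjust for the PIC base
     std::uint32_t V;
     std::memcpy(&V, Pos, sizeof V);
     V += static_cast<std::uint32_t>(Delta);
     std::memcpy(Pos, &V, sizeof V);
   }

The sketch reads and writes the field with ``memcpy`` so that unaligned
positions are handled safely; the stored value therefore uses the byte order of
the host.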
				1793
				1794	Target JIT Info
				1795	---------------
				1796
				1797	``XXXJITInfo.cpp`` implements the JIT interfaces for target-specific
				1798	code-generation activities, such as emitting machine code and stubs. At
				1799	minimum, a target-specific version of ``XXXJITInfo`` implements the following:
				1800
* ``getLazyResolverFunction`` --- Initializes the JIT and gives the target a
  function that is used for compilation.
				1803
				1804	* ``emitFunctionStub`` --- Returns a native function with a specified address
				1805	for a callback function.
				1806
				1807	* ``relocate`` --- Changes the addresses of referenced globals, based on
				1808	relocation types.
				1809
* Callback functions that are wrappers to a function stub that is used when the
  real target is not initially known.
				1812
``getLazyResolverFunction`` is generally trivial to implement. It stores the
incoming parameter in the global ``JITCompilerFunction`` and returns the
callback function that will be used as a function wrapper. For the Alpha target
(in ``AlphaJITInfo.cpp``), the ``getLazyResolverFunction`` implementation is
simply:
				1818
				1819	.. code-block:: c++
				1820
				1821	TargetJITInfo::LazyResolverFn AlphaJITInfo::getLazyResolverFunction(
				1822	JITCompilerFn F) {
				1823	JITCompilerFunction = F;
				1824	return AlphaCompilationCallback;
				1825	}
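
The store-and-return pattern can be tried outside LLVM with the standalone
sketch below. The ``ToyCompilerFn`` and ``ToyResolverFn`` signatures and the
``ToyCompilationCallback`` function are hypothetical; LLVM's actual types
differ, and a real callback also saves and restores registers, which this
sketch omits.

.. code-block:: c++

   #include <cassert>

   // Hypothetical signatures; LLVM's JITCompilerFn and LazyResolverFn differ.
   using ToyCompilerFn = void *(*)(void *Stub);
   using ToyResolverFn = void *(*)(void *Stub);

   // Set once by getLazyResolverFunction, then used by the callback.
   static ToyCompilerFn JITCompilerFunction = nullptr;

   // The function a stub would jump to: it asks the JIT to compile the real
   // function and returns the resulting address.
   static void *ToyCompilationCallback(void *Stub) {
     assert(JITCompilerFunction && "resolver was never initialized");
     return JITCompilerFunction(Stub);
   }

   ToyResolverFn getLazyResolverFunction(ToyCompilerFn F) {
     JITCompilerFunction = F;
     return ToyCompilationCallback;
   }

After ``getLazyResolverFunction`` has been called with a compiler function,
invoking the returned callback forwards the stub address to that function and
returns whatever address it produces.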
				1826
				1827	For the X86 target, the ``getLazyResolverFunction`` implementation is a little
				1828	more complicated, because it returns a different callback function for
				1829	processors with SSE instructions and XMM registers.
				1830
The callback function initially saves and later restores the callee register
values, incoming arguments, and frame and return address. The callback
function needs low-level access to the registers or stack, so it is typically
implemented in assembly language.
				1835