Blame - llvm/docs/ProgrammersManual.rst - toolchain/llvm-project

blob: ffc022eef168d8f96e0006125b140cda93f29958 [file] [log] [blame]

Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	1	========================
				2	LLVM Programmer's Manual
				3	========================
				4
				5	.. contents::
				6	:local:
				7
				8	.. warning::
Chris Lattner	045a73e	2013-01-10 21:24:04 +0000	[diff] [blame]	9	This is always a work in progress.
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	10
				11	.. _introduction:
				12
				13	Introduction
				14	============
				15
				16	This document is meant to highlight some of the important classes and interfaces
				17	available in the LLVM source-base. This manual is not intended to explain what
				18	LLVM is, how it works, and what LLVM code looks like. It assumes that you know
				19	the basics of LLVM and are interested in writing transformations or otherwise
				20	analyzing or manipulating the code.
				21
				22	This document should get you oriented so that you can find your way in the
				23	continuously growing source code that makes up the LLVM infrastructure. Note
				24	that this manual is not intended to serve as a replacement for reading the
				25	source code, so if you think there should be a method in one of these classes to
				26	do something, but it's not listed, check the source. Links to the `doxygen
				27	<http://llvm.org/doxygen/>`__ sources are provided to make this as easy as
				28	possible.
				29
				30	The first section of this document describes general information that is useful
				31	to know when working in the LLVM infrastructure, and the second describes the
				32	Core LLVM classes. In the future this manual will be extended with information
				33	describing how to use extension libraries, such as dominator information, CFG
				34	traversal routines, and useful utilities like the ``InstVisitor`` (`doxygen
				35	<http://llvm.org/doxygen/InstVisitor_8h-source.html>`__) template.
				36
				37	.. _general:
				38
				39	General Information
				40	===================
				41
				42	This section contains general information that is useful if you are working in
				43	the LLVM source-base, but that isn't specific to any particular API.
				44
				45	.. _stl:
				46
				47	The C++ Standard Template Library
				48	---------------------------------
				49
				50	LLVM makes heavy use of the C++ Standard Template Library (STL), perhaps much
				51	more than you are used to, or have seen before. Because of this, you might want
				52	to do a little background reading in the techniques used and capabilities of the
				53	library. There are many good pages that discuss the STL, and several books on
				54	the subject that you can get, so it will not be discussed in this document.
				55
				56	Here are some useful links:
				57
Sean Silva	4b58785	2012-12-04 03:30:36 +0000	[diff] [blame]	58	#. `cppreference.com
				59	<http://en.cppreference.com/w/>`_ - an excellent
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	60	reference for the STL and other parts of the standard C++ library.
				61
				62	#. `C++ In a Nutshell <http://www.tempest-sw.com/cpp/>`_ - This is an O'Reilly
				63	book in the making. It has a decent Standard Library Reference that rivals
				64	Dinkumware's, and is unfortunately no longer free since the book has been
				65	published.
				66
				67	#. `C++ Frequently Asked Questions <http://www.parashift.com/c++-faq-lite/>`_.
				68
				69	#. `SGI's STL Programmer's Guide <http://www.sgi.com/tech/stl/>`_ - Contains a
				70	useful `Introduction to the STL
				71	<http://www.sgi.com/tech/stl/stl_introduction.html>`_.
				72
				73	#. `Bjarne Stroustrup's C++ Page
				74	<http://www.research.att.com/%7Ebs/C++.html>`_.
				75
				76	#. `Bruce Eckel's Thinking in C++, 2nd ed. Volume 2 Revision 4.0
Sean Silva	c454f07e3	2012-12-04 03:45:27 +0000	[diff] [blame]	77	(even better, get the book)
				78	<http://www.mindview.net/Books/TICPP/ThinkingInCPP2e.html>`_.
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	79
Sean Silva	92a4489	2013-01-11 02:28:08 +0000	[diff] [blame]	80	You are also encouraged to take a look at the :doc:`LLVM Coding Standards
				81	<CodingStandards>` guide which focuses on how to write maintainable code more
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	82	than where to put your curly braces.
				83
				84	.. _resources:
				85
				86	Other useful references
				87	-----------------------
				88
				89	#. `Using static and shared libraries across platforms
				90	<http://www.fortran-2000.com/ArnaudRecipes/sharedlib.html>`_
				91
				92	.. _apis:
				93
				94	Important and useful LLVM APIs
				95	==============================
				96
				97	Here we highlight some LLVM APIs that are generally useful and good to know
				98	about when writing transformations.
				99
				100	.. _isa:
				101
				102	The ``isa<>``, ``cast<>`` and ``dyn_cast<>`` templates
				103	------------------------------------------------------
				104
				105	The LLVM source-base makes extensive use of a custom form of RTTI. These
				106	templates have many similarities to the C++ ``dynamic_cast<>`` operator, but
				107	they don't have some drawbacks (primarily stemming from the fact that
				108	``dynamic_cast<>`` only works on classes that have a v-table). Because they are
				109	used so often, you must know what they do and how they work. All of these
				110	templates are defined in the ``llvm/Support/Casting.h`` (`doxygen
				111	<http://llvm.org/doxygen/Casting_8h-source.html>`__) file (note that you very
				112	rarely have to include this file directly).
				113
				114	``isa<>``:
				115	The ``isa<>`` operator works exactly like the Java "``instanceof``" operator.
				116	It returns true or false depending on whether a reference or pointer points to
				117	an instance of the specified class. This can be very useful for constraint
				118	checking of various sorts (example below).
				119
				120	``cast<>``:
				121	The ``cast<>`` operator is a "checked cast" operation. It converts a pointer
				122	or reference from a base class to a derived class, causing an assertion
				123	failure if it is not really an instance of the right type. This should be
				124	used in cases where you have some information that makes you believe that
				125	something is of the right type. An example of the ``isa<>`` and ``cast<>``
				126	template is:
				127
				128	.. code-block:: c++
				129
				130	static bool isLoopInvariant(const Value V, const Loop L) {
				131	if (isa<Constant>(V) \|\| isa<Argument>(V) \|\| isa<GlobalValue>(V))
				132	return true;
				133
				134	// Otherwise, it must be an instruction...
				135	return !L->contains(cast<Instruction>(V)->getParent());
				136	}
				137
				138	Note that you should not use an ``isa<>`` test followed by a ``cast<>``,
				139	for that use the ``dyn_cast<>`` operator.
				140
				141	``dyn_cast<>``:
				142	The ``dyn_cast<>`` operator is a "checking cast" operation. It checks to see
				143	if the operand is of the specified type, and if so, returns a pointer to it
				144	(this operator does not work with references). If the operand is not of the
				145	correct type, a null pointer is returned. Thus, this works very much like
				146	the ``dynamic_cast<>`` operator in C++, and should be used in the same
				147	circumstances. Typically, the ``dyn_cast<>`` operator is used in an ``if``
				148	statement or some other flow control statement like this:
				149
				150	.. code-block:: c++
				151
Piotr Padlewski	db8d7c8	2016-11-11 22:12:15 +0000	[diff] [blame]	152	if (auto *AI = dyn_cast<AllocationInst>(Val)) {
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	153	// ...
				154	}
				155
				156	This form of the ``if`` statement effectively combines together a call to
				157	``isa<>`` and a call to ``cast<>`` into one statement, which is very
				158	convenient.
				159
				160	Note that the ``dyn_cast<>`` operator, like C++'s ``dynamic_cast<>`` or Java's
				161	``instanceof`` operator, can be abused. In particular, you should not use big
				162	chained ``if/then/else`` blocks to check for lots of different variants of
				163	classes. If you find yourself wanting to do this, it is much cleaner and more
				164	efficient to use the ``InstVisitor`` class to dispatch over the instruction
				165	type directly.
				166
				167	``cast_or_null<>``:
				168	The ``cast_or_null<>`` operator works just like the ``cast<>`` operator,
				169	except that it allows for a null pointer as an argument (which it then
				170	propagates). This can sometimes be useful, allowing you to combine several
				171	null checks into one.
				172
				173	``dyn_cast_or_null<>``:
				174	The ``dyn_cast_or_null<>`` operator works just like the ``dyn_cast<>``
				175	operator, except that it allows for a null pointer as an argument (which it
				176	then propagates). This can sometimes be useful, allowing you to combine
				177	several null checks into one.
				178
				179	These five templates can be used with any classes, whether they have a v-table
				180	or not. If you want to add support for these templates, see the document
Sean Silva	92a4489	2013-01-11 02:28:08 +0000	[diff] [blame]	181	:doc:`How to set up LLVM-style RTTI for your class hierarchy
				182	<HowToSetUpLLVMStyleRTTI>`
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	183
				184	.. _string_apis:
				185
				186	Passing strings (the ``StringRef`` and ``Twine`` classes)
				187	---------------------------------------------------------
				188
				189	Although LLVM generally does not do much string manipulation, we do have several
				190	important APIs which take strings. Two important examples are the Value class
				191	-- which has names for instructions, functions, etc. -- and the ``StringMap``
				192	class which is used extensively in LLVM and Clang.
				193
				194	These are generic classes, and they need to be able to accept strings which may
				195	have embedded null characters. Therefore, they cannot simply take a ``const
				196	char *``, and taking a ``const std::string&`` requires clients to perform a heap
				197	allocation which is usually unnecessary. Instead, many LLVM APIs use a
				198	``StringRef`` or a ``const Twine&`` for passing strings efficiently.
				199
				200	.. _StringRef:
				201
				202	The ``StringRef`` class
				203	^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				204
				205	The ``StringRef`` data type represents a reference to a constant string (a
				206	character array and a length) and supports the common operations available on
				207	``std::string``, but does not require heap allocation.
				208
				209	It can be implicitly constructed using a C style null-terminated string, an
				210	``std::string``, or explicitly with a character pointer and length. For
				211	example, the ``StringRef`` find function is declared as:
				212
				213	.. code-block:: c++
				214
				215	iterator find(StringRef Key);
				216
				217	and clients can call it using any one of:
				218
				219	.. code-block:: c++
				220
				221	Map.find("foo"); // Lookup "foo"
				222	Map.find(std::string("bar")); // Lookup "bar"
				223	Map.find(StringRef("\0baz", 4)); // Lookup "\0baz"
				224
				225	Similarly, APIs which need to return a string may return a ``StringRef``
				226	instance, which can be used directly or converted to an ``std::string`` using
				227	the ``str`` member function. See ``llvm/ADT/StringRef.h`` (`doxygen
				228	<http://llvm.org/doxygen/classllvm_1_1StringRef_8h-source.html>`__) for more
				229	information.
				230
				231	You should rarely use the ``StringRef`` class directly, because it contains
				232	pointers to external memory it is not generally safe to store an instance of the
				233	class (unless you know that the external storage will not be freed).
				234	``StringRef`` is small and pervasive enough in LLVM that it should always be
				235	passed by value.
				236
				237	The ``Twine`` class
				238	^^^^^^^^^^^^^^^^^^^
				239
				240	The ``Twine`` (`doxygen <http://llvm.org/doxygen/classllvm_1_1Twine.html>`__)
				241	class is an efficient way for APIs to accept concatenated strings. For example,
				242	a common LLVM paradigm is to name one instruction based on the name of another
				243	instruction with a suffix, for example:
				244
				245	.. code-block:: c++
				246
				247	New = CmpInst::Create(..., SO->getName() + ".cmp");
				248
				249	The ``Twine`` class is effectively a lightweight `rope
				250	<http://en.wikipedia.org/wiki/Rope_(computer_science)>`_ which points to
				251	temporary (stack allocated) objects. Twines can be implicitly constructed as
				252	the result of the plus operator applied to strings (i.e., a C strings, an
				253	``std::string``, or a ``StringRef``). The twine delays the actual concatenation
				254	of strings until it is actually required, at which point it can be efficiently
				255	rendered directly into a character array. This avoids unnecessary heap
				256	allocation involved in constructing the temporary results of string
				257	concatenation. See ``llvm/ADT/Twine.h`` (`doxygen
				258	<http://llvm.org/doxygen/Twine_8h_source.html>`__) and :ref:`here <dss_twine>`
				259	for more information.
				260
				261	As with a ``StringRef``, ``Twine`` objects point to external memory and should
				262	almost never be stored or mentioned directly. They are intended solely for use
				263	when defining a function which should be able to efficiently accept concatenated
				264	strings.
				265
Zachary Turner	11db264	2016-11-11 23:57:40 +0000	[diff] [blame]	266	.. _formatting_strings:
				267
				268	Formatting strings (the ``formatv`` function)
				269	---------------------------------------------
				270	While LLVM doesn't necessarily do a lot of string manipulation and parsing, it
				271	does do a lot of string formatting. From diagnostic messages, to llvm tool
				272	outputs such as ``llvm-readobj`` to printing verbose disassembly listings and
				273	LLDB runtime logging, the need for string formatting is pervasive.
				274
				275	The ``formatv`` is similar in spirit to ``printf``, but uses a different syntax
				276	which borrows heavily from Python and C#. Unlike ``printf`` it deduces the type
				277	to be formatted at compile time, so it does not need a format specifier such as
				278	``%d``. This reduces the mental overhead of trying to construct portable format
				279	strings, especially for platform-specific types like ``size_t`` or pointer types.
				280	Unlike both ``printf`` and Python, it additionally fails to compile if LLVM does
				281	not know how to format the type. These two properties ensure that the function
				282	is both safer and simpler to use than traditional formatting methods such as
				283	the ``printf`` family of functions.
				284
				285	Simple formatting
				286	^^^^^^^^^^^^^^^^^
				287
				288	A call to ``formatv`` involves a single format string consisting of 0 or more
				289	replacement sequences, followed by a variable length list of replacement values.
				290	A replacement sequence is a string of the form ``{N[[,align]:style]}``.
				291
				292	``N`` refers to the 0-based index of the argument from the list of replacement
				293	values. Note that this means it is possible to reference the same parameter
				294	multiple times, possibly with different style and/or alignment options, in any order.
				295
				296	``align`` is an optional string specifying the width of the field to format
				297	the value into, and the alignment of the value within the field. It is specified as
				298	an optional alignment style followed by a positive integral field width. The
				299	alignment style can be one of the characters ``-`` (left align), ``=`` (center align),
				300	or ``+`` (right align). The default is right aligned.
				301
				302	``style`` is an optional string consisting of a type specific that controls the
				303	formatting of the value. For example, to format a floating point value as a percentage,
				304	you can use the style option ``P``.
				305
				306	Custom formatting
				307	^^^^^^^^^^^^^^^^^
				308
				309	There are two ways to customize the formatting behavior for a type.
				310
				311	1. Provide a template specialization of ``llvm::format_provider<T>`` for your
				312	type ``T`` with the appropriate static format method.
				313
				314	.. code-block:: c++
				315
				316	namespace llvm {
				317	template<>
				318	struct format_provider<MyFooBar> {
				319	static void format(const MyFooBar &V, raw_ostream &Stream, StringRef Style) {
				320	// Do whatever is necessary to format `V` into `Stream`
				321	}
				322	};
				323	void foo() {
				324	MyFooBar X;
				325	std::string S = formatv("{0}", X);
				326	}
				327	}
				328
				329	This is a useful extensibility mechanism for adding support for formatting your own
				330	custom types with your own custom Style options. But it does not help when you want
				331	to extend the mechanism for formatting a type that the library already knows how to
				332	format. For that, we need something else.
				333
Pavel Labath	08c2e86	2016-12-15 09:40:27 +0000	[diff] [blame]	334	2. Provide a format adapter inheriting from ``llvm::FormatAdapter<T>``.
Zachary Turner	11db264	2016-11-11 23:57:40 +0000	[diff] [blame]	335
				336	.. code-block:: c++
				337
				338	namespace anything {
Pavel Labath	08c2e86	2016-12-15 09:40:27 +0000	[diff] [blame]	339	struct format_int_custom : public llvm::FormatAdapter<int> {
				340	explicit format_int_custom(int N) : llvm::FormatAdapter<int>(N) {}
				341	void format(llvm::raw_ostream &Stream, StringRef Style) override {
				342	// Do whatever is necessary to format ``this->Item`` into ``Stream``
Zachary Turner	11db264	2016-11-11 23:57:40 +0000	[diff] [blame]	343	}
				344	};
				345	}
				346	namespace llvm {
				347	void foo() {
				348	std::string S = formatv("{0}", anything::format_int_custom(42));
				349	}
				350	}
				351
Pavel Labath	08c2e86	2016-12-15 09:40:27 +0000	[diff] [blame]	352	If the type is detected to be derived from ``FormatAdapter<T>``, ``formatv``
				353	will call the
Zachary Turner	11db264	2016-11-11 23:57:40 +0000	[diff] [blame]	354	``format`` method on the argument passing in the specified style. This allows
				355	one to provide custom formatting of any type, including one which already has
				356	a builtin format provider.
				357
				358	``formatv`` Examples
				359	^^^^^^^^^^^^^^^^^^^^
				360	Below is intended to provide an incomplete set of examples demonstrating
				361	the usage of ``formatv``. More information can be found by reading the
				362	doxygen documentation or by looking at the unit test suite.
				363
				364
				365	.. code-block:: c++
				366
				367	std::string S;
				368	// Simple formatting of basic types and implicit string conversion.
				369	S = formatv("{0} ({1:P})", 7, 0.35); // S == "7 (35.00%)"
				370
				371	// Out-of-order referencing and multi-referencing
				372	outs() << formatv("{0} {2} {1} {0}", 1, "test", 3); // prints "1 3 test 1"
				373
				374	// Left, right, and center alignment
				375	S = formatv("{0,7}", 'a'); // S == " a";
				376	S = formatv("{0,-7}", 'a'); // S == "a ";
				377	S = formatv("{0,=7}", 'a'); // S == " a ";
				378	S = formatv("{0,+7}", 'a'); // S == " a";
				379
				380	// Custom styles
				381	S = formatv("{0:N} - {0:x} - {1:E}", 12345, 123908342); // S == "12,345 - 0x3039 - 1.24E8"
				382
				383	// Adapters
				384	S = formatv("{0}", fmt_align(42, AlignStyle::Center, 7)); // S == " 42 "
				385	S = formatv("{0}", fmt_repeat("hi", 3)); // S == "hihihi"
				386	S = formatv("{0}", fmt_pad("hi", 2, 6)); // S == " hi "
				387
				388	// Ranges
				389	std::vector<int> V = {8, 9, 10};
				390	S = formatv("{0}", make_range(V.begin(), V.end())); // S == "8, 9, 10"
				391	S = formatv("{0:$[+]}", make_range(V.begin(), V.end())); // S == "8+9+10"
				392	S = formatv("{0:$[ + ]@[x]}", make_range(V.begin(), V.end())); // S == "0x8 + 0x9 + 0xA"
				393
Lang Hames	f7f6d3e	2016-03-16 01:02:46 +0000	[diff] [blame]	394	.. _error_apis:
				395
				396	Error handling
				397	--------------
				398
				399	Proper error handling helps us identify bugs in our code, and helps end-users
				400	understand errors in their tool usage. Errors fall into two broad categories:
				401	programmatic and recoverable, with different strategies for handling and
				402	reporting.
				403
				404	Programmatic Errors
				405	^^^^^^^^^^^^^^^^^^^
				406
				407	Programmatic errors are violations of program invariants or API contracts, and
				408	represent bugs within the program itself. Our aim is to document invariants, and
				409	to abort quickly at the point of failure (providing some basic diagnostic) when
				410	invariants are broken at runtime.
				411
				412	The fundamental tools for handling programmatic errors are assertions and the
				413	llvm_unreachable function. Assertions are used to express invariant conditions,
				414	and should include a message describing the invariant:
				415
				416	.. code-block:: c++
				417
				418	assert(isPhysReg(R) && "All virt regs should have been allocated already.");
				419
				420	The llvm_unreachable function can be used to document areas of control flow
				421	that should never be entered if the program invariants hold:
				422
				423	.. code-block:: c++
				424
				425	enum { Foo, Bar, Baz } X = foo();
				426
				427	switch (X) {
				428	case Foo: /* Handle Foo */; break;
				429	case Bar: /* Handle Bar */; break;
				430	default:
				431	llvm_unreachable("X should be Foo or Bar here");
				432	}
				433
				434	Recoverable Errors
				435	^^^^^^^^^^^^^^^^^^
				436
				437	Recoverable errors represent an error in the program's environment, for example
				438	a resource failure (a missing file, a dropped network connection, etc.), or
				439	malformed input. These errors should be detected and communicated to a level of
				440	the program where they can be handled appropriately. Handling the error may be
				441	as simple as reporting the issue to the user, or it may involve attempts at
				442	recovery.
				443
				444	Recoverable errors are modeled using LLVM's ``Error`` scheme. This scheme
				445	represents errors using function return values, similar to classic C integer
				446	error codes, or C++'s ``std::error_code``. However, the ``Error`` class is
				447	actually a lightweight wrapper for user-defined error types, allowing arbitrary
				448	information to be attached to describe the error. This is similar to the way C++
				449	exceptions allow throwing of user-defined types.
				450
Lang Hames	42f5dd8	2016-09-02 03:46:08 +0000	[diff] [blame]	451	Success values are created by calling ``Error::success()``, E.g.:
Lang Hames	f7f6d3e	2016-03-16 01:02:46 +0000	[diff] [blame]	452
				453	.. code-block:: c++
				454
				455	Error foo() {
				456	// Do something.
				457	// Return success.
				458	return Error::success();
				459	}
				460
				461	Success values are very cheap to construct and return - they have minimal
				462	impact on program performance.
				463
				464	Failure values are constructed using ``make_error<T>``, where ``T`` is any class
Lang Hames	42f5dd8	2016-09-02 03:46:08 +0000	[diff] [blame]	465	that inherits from the ErrorInfo utility, E.g.:
Lang Hames	f7f6d3e	2016-03-16 01:02:46 +0000	[diff] [blame]	466
				467	.. code-block:: c++
Kostya Serebryany	af67fd1	2016-10-27 20:14:03 +0000	[diff] [blame]	468
Lang Hames	03a88cc	2016-10-25 21:19:30 +0000	[diff] [blame]	469	class BadFileFormat : public ErrorInfo<BadFileFormat> {
Lang Hames	f7f6d3e	2016-03-16 01:02:46 +0000	[diff] [blame]	470	public:
Reid Kleckner	a15b76b	2016-03-24 23:49:34 +0000	[diff] [blame]	471	static char ID;
Lang Hames	03a88cc	2016-10-25 21:19:30 +0000	[diff] [blame]	472	std::string Path;
				473
				474	BadFileFormat(StringRef Path) : Path(Path.str()) {}
				475
				476	void log(raw_ostream &OS) const override {
				477	OS << Path << " is malformed";
				478	}
				479
				480	std::error_code convertToErrorCode() const override {
				481	return make_error_code(object_error::parse_failed);
				482	}
Lang Hames	f7f6d3e	2016-03-16 01:02:46 +0000	[diff] [blame]	483	};
				484
Lang Hames	03a88cc	2016-10-25 21:19:30 +0000	[diff] [blame]	485	char FileExists::ID; // This should be declared in the C++ file.
Reid Kleckner	a15b76b	2016-03-24 23:49:34 +0000	[diff] [blame]	486
Lang Hames	03a88cc	2016-10-25 21:19:30 +0000	[diff] [blame]	487	Error printFormattedFile(StringRef Path) {
				488	if (<check for valid format>)
				489	return make_error<InvalidObjectFile>(Path);
				490	// print file contents.
Lang Hames	f7f6d3e	2016-03-16 01:02:46 +0000	[diff] [blame]	491	return Error::success();
				492	}
				493
Lang Hames	a0f517f	2016-03-23 03:18:16 +0000	[diff] [blame]	494	Error values can be implicitly converted to bool: true for error, false for
				495	success, enabling the following idiom:
				496
Justin Bogner	91269bf	2016-03-23 22:54:19 +0000	[diff] [blame]	497	.. code-block:: c++
Lang Hames	a0f517f	2016-03-23 03:18:16 +0000	[diff] [blame]	498
Lang Hames	1684d7c	2016-03-24 18:05:21 +0000	[diff] [blame]	499	Error mayFail();
Lang Hames	a0f517f	2016-03-23 03:18:16 +0000	[diff] [blame]	500
Lang Hames	1684d7c	2016-03-24 18:05:21 +0000	[diff] [blame]	501	Error foo() {
				502	if (auto Err = mayFail())
				503	return Err;
				504	// Success! We can proceed.
				505	...
Lang Hames	a0f517f	2016-03-23 03:18:16 +0000	[diff] [blame]	506
Lang Hames	f7f6d3e	2016-03-16 01:02:46 +0000	[diff] [blame]	507	For functions that can fail but need to return a value the ``Expected<T>``
				508	utility can be used. Values of this type can be constructed with either a
Lang Hames	42f5dd8	2016-09-02 03:46:08 +0000	[diff] [blame]	509	``T``, or an ``Error``. Expected<T> values are also implicitly convertible to
Lang Hames	03a88cc	2016-10-25 21:19:30 +0000	[diff] [blame]	510	boolean, but with the opposite convention to ``Error``: true for success, false
				511	for error. If success, the ``T`` value can be accessed via the dereference
				512	operator. If failure, the ``Error`` value can be extracted using the
				513	``takeError()`` method. Idiomatic usage looks like:
Lang Hames	f7f6d3e	2016-03-16 01:02:46 +0000	[diff] [blame]	514
				515	.. code-block:: c++
				516
Lang Hames	03a88cc	2016-10-25 21:19:30 +0000	[diff] [blame]	517	Expected<FormattedFile> openFormattedFile(StringRef Path) {
				518	// If badly formatted, return an error.
				519	if (auto Err = checkFormat(Path))
				520	return std::move(Err);
				521	// Otherwise return a FormattedFile instance.
				522	return FormattedFile(Path);
Lang Hames	f7f6d3e	2016-03-16 01:02:46 +0000	[diff] [blame]	523	}
				524
Lang Hames	03a88cc	2016-10-25 21:19:30 +0000	[diff] [blame]	525	Error processFormattedFile(StringRef Path) {
				526	// Try to open a formatted file
				527	if (auto FileOrErr = openFormattedFile(Path)) {
				528	// On success, grab a reference to the file and continue.
				529	auto &File = *FileOrErr;
Lang Hames	497fd94	2016-10-25 22:41:54 +0000	[diff] [blame]	530	...
Lang Hames	ca20d9e	2016-10-25 22:38:50 +0000	[diff] [blame]	531	} else
				532	// On error, extract the Error value and return it.
Lang Hames	03a88cc	2016-10-25 21:19:30 +0000	[diff] [blame]	533	return FileOrErr.takeError();
Lang Hames	f7f6d3e	2016-03-16 01:02:46 +0000	[diff] [blame]	534	}
				535
Lang Hames	03a88cc	2016-10-25 21:19:30 +0000	[diff] [blame]	536	If an ``Expected<T>`` value is in success mode then the ``takeError()`` method
				537	will return a success value. Using this fact, the above function can be
				538	rewritten as:
				539
				540	.. code-block:: c++
				541
				542	Error processFormattedFile(StringRef Path) {
				543	// Try to open a formatted file
				544	auto FileOrErr = openFormattedFile(Path);
				545	if (auto Err = FileOrErr.takeError())
				546	// On error, extract the Error value and return it.
				547	return Err;
				548	// On success, grab a reference to the file and continue.
				549	auto &File = *FileOrErr;
Lang Hames	497fd94	2016-10-25 22:41:54 +0000	[diff] [blame]	550	...
Lang Hames	03a88cc	2016-10-25 21:19:30 +0000	[diff] [blame]	551	}
				552
				553	This second form is often more readable for functions that involve multiple
				554	``Expected<T>`` values as it limits the indentation required.
				555
				556	All ``Error`` instances, whether success or failure, must be either checked or
				557	moved from (via ``std::move`` or a return) before they are destructed.
				558	Accidentally discarding an unchecked error will cause a program abort at the
				559	point where the unchecked value's destructor is run, making it easy to identify
				560	and fix violations of this rule.
Lang Hames	f7f6d3e	2016-03-16 01:02:46 +0000	[diff] [blame]	561
				562	Success values are considered checked once they have been tested (by invoking
				563	the boolean conversion operator):
				564
				565	.. code-block:: c++
				566
				567	if (auto Err = canFail(...))
				568	return Err; // Failure value - move error to caller.
				569
				570	// Safe to continue: Err was checked.
				571
Lang Hames	c5d41d4	2016-09-02 03:50:50 +0000	[diff] [blame]	572	In contrast, the following code will always cause an abort, even if ``canFail``
				573	returns a success value:
Lang Hames	f7f6d3e	2016-03-16 01:02:46 +0000	[diff] [blame]	574
				575	.. code-block:: c++
				576
				577	canFail();
				578	// Program will always abort here, even if canFail() returns Success, since
				579	// the value is not checked.
				580
				581	Failure values are considered checked once a handler for the error type has
				582	been activated:
				583
				584	.. code-block:: c++
				585
Lang Hames	03a88cc	2016-10-25 21:19:30 +0000	[diff] [blame]	586	handleErrors(
Kostya Serebryany	a1f87e5	2016-10-31 21:10:26 +0000	[diff] [blame]	587	processFormattedFile(...),
Lang Hames	03a88cc	2016-10-25 21:19:30 +0000	[diff] [blame]	588	[](const BadFileFormat &BFF) {
Kostya Serebryany	a1f87e5	2016-10-31 21:10:26 +0000	[diff] [blame]	589	report("Unable to process " + BFF.Path + ": bad format");
Lang Hames	03a88cc	2016-10-25 21:19:30 +0000	[diff] [blame]	590	},
				591	[](const FileNotFound &FNF) {
				592	report("File not found " + FNF.Path);
				593	});
Lang Hames	f7f6d3e	2016-03-16 01:02:46 +0000	[diff] [blame]	594
Lang Hames	03a88cc	2016-10-25 21:19:30 +0000	[diff] [blame]	595	The ``handleErrors`` function takes an error as its first argument, followed by
				596	a variadic list of "handlers", each of which must be a callable type (a
				597	function, lambda, or class with a call operator) with one argument. The
				598	``handleErrors`` function will visit each handler in the sequence and check its
				599	argument type against the dynamic type of the error, running the first handler
Lang Hames	19a2308	2016-11-07 22:33:13 +0000	[diff] [blame]	600	that matches. This is the same decision process that is used decide which catch
				601	clause to run for a C++ exception.
Lang Hames	f7f6d3e	2016-03-16 01:02:46 +0000	[diff] [blame]	602
Lang Hames	03a88cc	2016-10-25 21:19:30 +0000	[diff] [blame]	603	Since the list of handlers passed to ``handleErrors`` may not cover every error
				604	type that can occur, the ``handleErrors`` function also returns an Error value
				605	that must be checked or propagated. If the error value that is passed to
				606	``handleErrors`` does not match any of the handlers it will be returned from
				607	handleErrors. Idiomatic use of ``handleErrors`` thus looks like:
				608
				609	.. code-block:: c++
				610
				611	if (auto Err =
				612	handleErrors(
				613	processFormattedFile(...),
				614	[](const BadFileFormat &BFF) {
Kostya Serebryany	b848eaf	2016-11-01 05:51:12 +0000	[diff] [blame]	615	report("Unable to process " + BFF.Path + ": bad format");
Lang Hames	03a88cc	2016-10-25 21:19:30 +0000	[diff] [blame]	616	},
				617	[](const FileNotFound &FNF) {
				618	report("File not found " + FNF.Path);
				619	}))
				620	return Err;
				621
				622	In cases where you truly know that the handler list is exhaustive the
				623	``handleAllErrors`` function can be used instead. This is identical to
				624	``handleErrors`` except that it will terminate the program if an unhandled
				625	error is passed in, and can therefore return void. The ``handleAllErrors``
				626	function should generally be avoided: the introduction of a new error type
				627	elsewhere in the program can easily turn a formerly exhaustive list of errors
				628	into a non-exhaustive list, risking unexpected program termination. Where
				629	possible, use handleErrors and propagate unknown errors up the stack instead.
				630
Lang Hames	19a2308	2016-11-07 22:33:13 +0000	[diff] [blame]	631	For tool code, where errors can be handled by printing an error message then
				632	exiting with an error code, the :ref:`ExitOnError <err_exitonerr>` utility
				633	may be a better choice than handleErrors, as it simplifies control flow when
				634	calling fallible functions.
				635
Lang Hames	03a88cc	2016-10-25 21:19:30 +0000	[diff] [blame]	636	StringError
				637	"""""""""""
				638
				639	Many kinds of errors have no recovery strategy, the only action that can be
				640	taken is to report them to the user so that the user can attempt to fix the
				641	environment. In this case representing the error as a string makes perfect
Lang Hames	6b19ce6	2016-10-25 22:22:48 +0000	[diff] [blame]	642	sense. LLVM provides the ``StringError`` class for this purpose. It takes two
Lang Hames	03a88cc	2016-10-25 21:19:30 +0000	[diff] [blame]	643	arguments: A string error message, and an equivalent ``std::error_code`` for
				644	interoperability:
				645
				646	.. code-block:: c++
				647
				648	make_error<StringError>("Bad executable",
				649	make_error_code(errc::executable_format_error"));
				650
				651	If you're certain that the error you're building will never need to be converted
				652	to a ``std::error_code`` you can use the ``inconvertibleErrorCode()`` function:
				653
				654	.. code-block:: c++
				655
				656	make_error<StringError>("Bad executable", inconvertibleErrorCode());
				657
				658	This should be done only after careful consideration. If any attempt is made to
				659	convert this error to a ``std::error_code`` it will trigger immediate program
				660	termination. Unless you are certain that your errors will not need
				661	interoperability you should look for an existing ``std::error_code`` that you
				662	can convert to, and even (as painful as it is) consider introducing a new one as
				663	a stopgap measure.
				664
				665	Interoperability with std::error_code and ErrorOr
				666	"""""""""""""""""""""""""""""""""""""""""""""""""
				667
				668	Many existing LLVM APIs use ``std::error_code`` and its partner ``ErrorOr<T>``
				669	(which plays the same role as ``Expected<T>``, but wraps a ``std::error_code``
				670	rather than an ``Error``). The infectious nature of error types means that an
				671	attempt to change one of these functions to return ``Error`` or ``Expected<T>``
				672	instead often results in an avalanche of changes to callers, callers of callers,
				673	and so on. (The first such attempt, returning an ``Error`` from
Kostya Serebryany	b848eaf	2016-11-01 05:51:12 +0000	[diff] [blame]	674	MachOObjectFile's constructor, was abandoned after the diff reached 3000 lines,
Lang Hames	03a88cc	2016-10-25 21:19:30 +0000	[diff] [blame]	675	impacted half a dozen libraries, and was still growing).
				676
				677	To solve this problem, the ``Error``/``std::error_code`` interoperability requirement was
				678	introduced. Two pairs of functions allow any ``Error`` value to be converted to a
				679	``std::error_code``, any ``Expected<T>`` to be converted to an ``ErrorOr<T>``, and vice
				680	versa:
				681
				682	.. code-block:: c++
				683
				684	std::error_code errorToErrorCode(Error Err);
				685	Error errorCodeToError(std::error_code EC);
				686
				687	template <typename T> ErrorOr<T> expectedToErrorOr(Expected<T> TOrErr);
				688	template <typename T> Expected<T> errorOrToExpected(ErrorOr<T> TOrEC);
				689
				690
				691	Using these APIs it is easy to make surgical patches that update individual
				692	functions from ``std::error_code`` to ``Error``, and from ``ErrorOr<T>`` to
				693	``Expected<T>``.
				694
				695	Returning Errors from error handlers
				696	""""""""""""""""""""""""""""""""""""
				697
				698	Error recovery attempts may themselves fail. For that reason, ``handleErrors``
				699	actually recognises three different forms of handler signature:
				700
				701	.. code-block:: c++
				702
				703	// Error must be handled, no new errors produced:
				704	void(UserDefinedError &E);
				705
				706	// Error must be handled, new errors can be produced:
				707	Error(UserDefinedError &E);
				708
				709	// Original error can be inspected, then re-wrapped and returned (or a new
				710	// error can be produced):
				711	Error(std::unique_ptr<UserDefinedError> E);
				712
				713	Any error returned from a handler will be returned from the ``handleErrors``
				714	function so that it can be handled itself, or propagated up the stack.
				715
Lang Hames	19a2308	2016-11-07 22:33:13 +0000	[diff] [blame]	716	.. _err_exitonerr:
				717
Lang Hames	03a88cc	2016-10-25 21:19:30 +0000	[diff] [blame]	718	Using ExitOnError to simplify tool code
				719	"""""""""""""""""""""""""""""""""""""""
				720
				721	Library code should never call ``exit`` for a recoverable error, however in tool
Lang Hames	6b19ce6	2016-10-25 22:22:48 +0000	[diff] [blame]	722	code (especially command line tools) this can be a reasonable approach. Calling
Lang Hames	03a88cc	2016-10-25 21:19:30 +0000	[diff] [blame]	723	``exit`` upon encountering an error dramatically simplifies control flow as the
				724	error no longer needs to be propagated up the stack. This allows code to be
				725	written in straight-line style, as long as each fallible call is wrapped in a
Lang Hames	4f8a960	2016-10-25 22:35:55 +0000	[diff] [blame]	726	check and call to exit. The ``ExitOnError`` class supports this pattern by
Lang Hames	03a88cc	2016-10-25 21:19:30 +0000	[diff] [blame]	727	providing call operators that inspect ``Error`` values, stripping the error away
				728	in the success case and logging to ``stderr`` then exiting in the failure case.
				729
				730	To use this class, declare a global ``ExitOnError`` variable in your program:
				731
				732	.. code-block:: c++
				733
				734	ExitOnError ExitOnErr;
				735
				736	Calls to fallible functions can then be wrapped with a call to ``ExitOnErr``,
				737	turning them into non-failing calls:
				738
				739	.. code-block:: c++
				740
				741	Error mayFail();
				742	Expected<int> mayFail2();
				743
				744	void foo() {
				745	ExitOnErr(mayFail());
				746	int X = ExitOnErr(mayFail2());
				747	}
				748
Kostya Serebryany	b848eaf	2016-11-01 05:51:12 +0000	[diff] [blame]	749	On failure, the error's log message will be written to ``stderr``, optionally
				750	preceded by a string "banner" that can be set by calling the setBanner method. A
Lang Hames	03a88cc	2016-10-25 21:19:30 +0000	[diff] [blame]	751	mapping can also be supplied from ``Error`` values to exit codes using the
				752	``setExitCodeMapper`` method:
				753
Lang Hames	7a9ca3337	2016-10-25 22:25:07 +0000	[diff] [blame]	754	.. code-block:: c++
				755
				756	int main(int argc, char *argv[]) {
Kostya Serebryany	b848eaf	2016-11-01 05:51:12 +0000	[diff] [blame]	757	ExitOnErr.setBanner(std::string(argv[0]) + " error:");
Lang Hames	7a9ca3337	2016-10-25 22:25:07 +0000	[diff] [blame]	758	ExitOnErr.setExitCodeMapper(
				759	[](const Error &Err) {
				760	if (Err.isA<BadFileFormat>())
				761	return 2;
				762	return 1;
				763	});
Lang Hames	03a88cc	2016-10-25 21:19:30 +0000	[diff] [blame]	764
				765	Use ``ExitOnError`` in your tool code where possible as it can greatly improve
				766	readability.
				767
				768	Fallible constructors
				769	"""""""""""""""""""""
				770
				771	Some classes require resource acquisition or other complex initialization that
Kostya Serebryany	b848eaf	2016-11-01 05:51:12 +0000	[diff] [blame]	772	can fail during construction. Unfortunately constructors can't return errors,
				773	and having clients test objects after they're constructed to ensure that they're
				774	valid is error prone as it's all too easy to forget the test. To work around
Lang Hames	03a88cc	2016-10-25 21:19:30 +0000	[diff] [blame]	775	this, use the named constructor idiom and return an ``Expected<T>``:
				776
				777	.. code-block:: c++
				778
				779	class Foo {
				780	public:
				781
Lang Hames	4f8a960	2016-10-25 22:35:55 +0000	[diff] [blame]	782	static Expected<Foo> Create(Resource R1, Resource R2) {
Lang Hames	03a88cc	2016-10-25 21:19:30 +0000	[diff] [blame]	783	Error Err;
				784	Foo F(R1, R2, Err);
				785	if (Err)
				786	return std::move(Err);
				787	return std::move(F);
				788	}
				789
				790	private:
				791
				792	Foo(Resource R1, Resource R2, Error &Err) {
				793	ErrorAsOutParameter EAO(&Err);
				794	if (auto Err2 = R1.acquire()) {
				795	Err = std::move(Err2);
				796	return;
				797	}
				798	Err = R2.acquire();
				799	}
				800	};
				801
				802
				803	Here, the named constructor passes an ``Error`` by reference into the actual
				804	constructor, which the constructor can then use to return errors. The
				805	``ErrorAsOutParameter`` utility sets the ``Error`` value's checked flag on entry
				806	to the constructor so that the error can be assigned to, then resets it on exit
				807	to force the client (the named constructor) to check the error.
				808
				809	By using this idiom, clients attempting to construct a Foo receive either a
				810	well-formed Foo or an Error, never an object in an invalid state.
				811
				812	Propagating and consuming errors based on types
				813	"""""""""""""""""""""""""""""""""""""""""""""""
				814
				815	In some contexts, certain types of error are known to be benign. For example,
				816	when walking an archive, some clients may be happy to skip over badly formatted
				817	object files rather than terminating the walk immediately. Skipping badly
Lang Hames	4f8a960	2016-10-25 22:35:55 +0000	[diff] [blame]	818	formatted objects could be achieved using an elaborate handler method, but the
Lang Hames	03a88cc	2016-10-25 21:19:30 +0000	[diff] [blame]	819	Error.h header provides two utilities that make this idiom much cleaner: the
				820	type inspection method, ``isA``, and the ``consumeError`` function:
				821
				822	.. code-block:: c++
				823
				824	Error walkArchive(Archive A) {
				825	for (unsigned I = 0; I != A.numMembers(); ++I) {
				826	auto ChildOrErr = A.getMember(I);
Lang Hames	4f8a960	2016-10-25 22:35:55 +0000	[diff] [blame]	827	if (auto Err = ChildOrErr.takeError()) {
Lang Hames	03a88cc	2016-10-25 21:19:30 +0000	[diff] [blame]	828	if (Err.isA<BadFileFormat>())
				829	consumeError(std::move(Err))
				830	else
				831	return Err;
Lang Hames	4f8a960	2016-10-25 22:35:55 +0000	[diff] [blame]	832	}
				833	auto &Child = *ChildOrErr;
Lang Hames	497fd94	2016-10-25 22:41:54 +0000	[diff] [blame]	834	// Use Child
				835	...
Lang Hames	03a88cc	2016-10-25 21:19:30 +0000	[diff] [blame]	836	}
				837	return Error::success();
				838	}
				839
				840	Concatenating Errors with joinErrors
				841	""""""""""""""""""""""""""""""""""""
				842
				843	In the archive walking example above ``BadFileFormat`` errors are simply
				844	consumed and ignored. If the client had wanted report these errors after
				845	completing the walk over the archive they could use the ``joinErrors`` utility:
				846
				847	.. code-block:: c++
				848
				849	Error walkArchive(Archive A) {
				850	Error DeferredErrs = Error::success();
				851	for (unsigned I = 0; I != A.numMembers(); ++I) {
				852	auto ChildOrErr = A.getMember(I);
				853	if (auto Err = ChildOrErr.takeError())
				854	if (Err.isA<BadFileFormat>())
				855	DeferredErrs = joinErrors(std::move(DeferredErrs), std::move(Err));
				856	else
				857	return Err;
				858	auto &Child = *ChildOrErr;
Lang Hames	497fd94	2016-10-25 22:41:54 +0000	[diff] [blame]	859	// Use Child
				860	...
Lang Hames	03a88cc	2016-10-25 21:19:30 +0000	[diff] [blame]	861	}
				862	return DeferredErrs;
				863	}
				864
				865	The ``joinErrors`` routine builds a special error type called ``ErrorList``,
				866	which holds a list of user defined errors. The ``handleErrors`` routine
				867	recognizes this type and will attempt to handle each of the contained erorrs in
				868	order. If all contained errors can be handled, ``handleErrors`` will return
				869	``Error::success()``, otherwise ``handleErrors`` will concatenate the remaining
				870	errors and return the resulting ``ErrorList``.
				871
				872	Building fallible iterators and iterator ranges
				873	"""""""""""""""""""""""""""""""""""""""""""""""
				874
				875	The archive walking examples above retrieve archive members by index, however
				876	this requires considerable boiler-plate for iteration and error checking. We can
Lang Hames	8009f61	2016-10-25 23:08:32 +0000	[diff] [blame]	877	clean this up by using ``Error`` with the "fallible iterator" pattern. The usual
				878	C++ iterator patterns do not allow for failure on increment, but we can
				879	incorporate support for it by having iterators hold an Error reference through
				880	which they can report failure. In this pattern, if an increment operation fails
				881	the failure is recorded via the Error reference and the iterator value is set to
				882	the end of the range in order to terminate the loop. This ensures that the
				883	dereference operation is safe anywhere that an ordinary iterator dereference
				884	would be safe (i.e. when the iterator is not equal to end). Where this pattern
				885	is followed (as in the ``llvm::object::Archive`` class) the result is much
				886	cleaner iteration idiom:
Lang Hames	03a88cc	2016-10-25 21:19:30 +0000	[diff] [blame]	887
				888	.. code-block:: c++
				889
				890	Error Err;
				891	for (auto &Child : Ar->children(Err)) {
Kostya Serebryany	b848eaf	2016-11-01 05:51:12 +0000	[diff] [blame]	892	// Use Child - we only enter the loop when it's valid
Lang Hames	497fd94	2016-10-25 22:41:54 +0000	[diff] [blame]	893	...
Lang Hames	03a88cc	2016-10-25 21:19:30 +0000	[diff] [blame]	894	}
Kostya Serebryany	b848eaf	2016-11-01 05:51:12 +0000	[diff] [blame]	895	// Check Err after the loop to ensure it didn't break due to an error.
Lang Hames	03a88cc	2016-10-25 21:19:30 +0000	[diff] [blame]	896	if (Err)
				897	return Err;
Lang Hames	f7f6d3e	2016-03-16 01:02:46 +0000	[diff] [blame]	898
Richard Smith	ddb2fde	2014-05-06 07:45:39 +0000	[diff] [blame]	899	.. _function_apis:
				900
Lang Hames	f7f6d3e	2016-03-16 01:02:46 +0000	[diff] [blame]	901	More information on Error and its related utilities can be found in the
				902	Error.h header file.
				903
Richard Smith	ddb2fde	2014-05-06 07:45:39 +0000	[diff] [blame]	904	Passing functions and other callable objects
				905	--------------------------------------------
				906
				907	Sometimes you may want a function to be passed a callback object. In order to
				908	support lambda expressions and other function objects, you should not use the
				909	traditional C approach of taking a function pointer and an opaque cookie:
				910
				911	.. code-block:: c++
				912
				913	void takeCallback(bool (Callback)(Function , void ), void Cookie);
				914
				915	Instead, use one of the following approaches:
				916
				917	Function template
				918	^^^^^^^^^^^^^^^^^
				919
				920	If you don't mind putting the definition of your function into a header file,
				921	make it a function template that is templated on the callable type.
				922
				923	.. code-block:: c++
				924
				925	template<typename Callable>
				926	void takeCallback(Callable Callback) {
				927	Callback(1, 2, 3);
				928	}
				929
				930	The ``function_ref`` class template
				931	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				932
				933	The ``function_ref``
Sean Silva	96faef2	2016-07-09 23:08:14 +0000	[diff] [blame]	934	(`doxygen <http://llvm.org/docs/doxygen/html/classllvm_1_1function__ref_3_01Ret_07Params_8_8_8_08_4.html>`__) class
Richard Smith	ddb2fde	2014-05-06 07:45:39 +0000	[diff] [blame]	935	template represents a reference to a callable object, templated over the type
				936	of the callable. This is a good choice for passing a callback to a function,
Reid Kleckner	5c2245b	2014-07-17 22:43:00 +0000	[diff] [blame]	937	if you don't need to hold onto the callback after the function returns. In this
				938	way, ``function_ref`` is to ``std::function`` as ``StringRef`` is to
				939	``std::string``.
Richard Smith	ddb2fde	2014-05-06 07:45:39 +0000	[diff] [blame]	940
				941	``function_ref<Ret(Param1, Param2, ...)>`` can be implicitly constructed from
				942	any callable object that can be called with arguments of type ``Param1``,
				943	``Param2``, ..., and returns a value that can be converted to type ``Ret``.
				944	For example:
				945
				946	.. code-block:: c++
				947
				948	void visitBasicBlocks(Function F, function_ref<bool (BasicBlock)> Callback) {
				949	for (BasicBlock &BB : *F)
				950	if (Callback(&BB))
				951	return;
				952	}
				953
				954	can be called using:
				955
				956	.. code-block:: c++
				957
				958	visitBasicBlocks(F, [&](BasicBlock *BB) {
				959	if (process(BB))
				960	return isEmpty(BB);
				961	return false;
				962	});
				963
Reid Kleckner	5c2245b	2014-07-17 22:43:00 +0000	[diff] [blame]	964	Note that a ``function_ref`` object contains pointers to external memory, so it
				965	is not generally safe to store an instance of the class (unless you know that
				966	the external storage will not be freed). If you need this ability, consider
				967	using ``std::function``. ``function_ref`` is small enough that it should always
				968	be passed by value.
Richard Smith	ddb2fde	2014-05-06 07:45:39 +0000	[diff] [blame]	969
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	970	.. _DEBUG:
				971
				972	The ``DEBUG()`` macro and ``-debug`` option
				973	-------------------------------------------
				974
				975	Often when working on your pass you will put a bunch of debugging printouts and
				976	other code into your pass. After you get it working, you want to remove it, but
				977	you may need it again in the future (to work out new bugs that you run across).
				978
				979	Naturally, because of this, you don't want to delete the debug printouts, but
				980	you don't want them to always be noisy. A standard compromise is to comment
				981	them out, allowing you to enable them if you need them in the future.
				982
				983	The ``llvm/Support/Debug.h`` (`doxygen
				984	<http://llvm.org/doxygen/Debug_8h-source.html>`__) file provides a macro named
				985	``DEBUG()`` that is a much nicer solution to this problem. Basically, you can
				986	put arbitrary code into the argument of the ``DEBUG`` macro, and it is only
				987	executed if '``opt``' (or any other tool) is run with the '``-debug``' command
				988	line argument:
				989
				990	.. code-block:: c++
				991
				992	DEBUG(errs() << "I am here!\n");
				993
				994	Then you can run your pass like this:
				995
				996	.. code-block:: none
				997
				998	$ opt < a.bc > /dev/null -mypass
				999	<no output>
				1000	$ opt < a.bc > /dev/null -mypass -debug
				1001	I am here!
				1002
				1003	Using the ``DEBUG()`` macro instead of a home-brewed solution allows you to not
				1004	have to create "yet another" command line option for the debug output for your
Justin Bogner	c2e54af	2015-10-15 18:17:44 +0000	[diff] [blame]	1005	pass. Note that ``DEBUG()`` macros are disabled for non-asserts builds, so they
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	1006	do not cause a performance impact at all (for the same reason, they should also
				1007	not contain side-effects!).
				1008
				1009	One additional nice thing about the ``DEBUG()`` macro is that you can enable or
				1010	disable it directly in gdb. Just use "``set DebugFlag=0``" or "``set
				1011	DebugFlag=1``" from the gdb if the program is running. If the program hasn't
				1012	been started yet, you can always just run it with ``-debug``.
				1013
				1014	.. _DEBUG_TYPE:
				1015
				1016	Fine grained debug info with ``DEBUG_TYPE`` and the ``-debug-only`` option
				1017	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				1018
				1019	Sometimes you may find yourself in a situation where enabling ``-debug`` just
				1020	turns on too much information (such as when working on the code generator).
				1021	If you want to enable debug information with more fine-grained control, you
Justin Bogner	c2e54af	2015-10-15 18:17:44 +0000	[diff] [blame]	1022	should define the ``DEBUG_TYPE`` macro and use the ``-debug-only`` option as
Alexey Samsonov	6c0ddfe	2014-06-05 23:12:43 +0000	[diff] [blame]	1023	follows:
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	1024
				1025	.. code-block:: c++
				1026
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	1027	#define DEBUG_TYPE "foo"
				1028	DEBUG(errs() << "'foo' debug type\n");
				1029	#undef DEBUG_TYPE
				1030	#define DEBUG_TYPE "bar"
				1031	DEBUG(errs() << "'bar' debug type\n"));
				1032	#undef DEBUG_TYPE
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	1033
				1034	Then you can run your pass like this:
				1035
				1036	.. code-block:: none
				1037
				1038	$ opt < a.bc > /dev/null -mypass
				1039	<no output>
				1040	$ opt < a.bc > /dev/null -mypass -debug
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	1041	'foo' debug type
				1042	'bar' debug type
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	1043	$ opt < a.bc > /dev/null -mypass -debug-only=foo
				1044	'foo' debug type
				1045	$ opt < a.bc > /dev/null -mypass -debug-only=bar
				1046	'bar' debug type
Christof Douma	f617e67	2016-01-12 10:23:13 +0000	[diff] [blame]	1047	$ opt < a.bc > /dev/null -mypass -debug-only=foo,bar
				1048	'foo' debug type
				1049	'bar' debug type
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	1050
				1051	Of course, in practice, you should only set ``DEBUG_TYPE`` at the top of a file,
Justin Bogner	c2e54af	2015-10-15 18:17:44 +0000	[diff] [blame]	1052	to specify the debug type for the entire module. Be careful that you only do
				1053	this after including Debug.h and not around any #include of headers. Also, you
				1054	should use names more meaningful than "foo" and "bar", because there is no
				1055	system in place to ensure that names do not conflict. If two different modules
				1056	use the same string, they will all be turned on when the name is specified.
				1057	This allows, for example, all debug information for instruction scheduling to be
				1058	enabled with ``-debug-only=InstrSched``, even if the source lives in multiple
Sylvestre Ledru	84666a1	2016-02-14 20:16:22 +0000	[diff] [blame]	1059	files. The name must not include a comma (,) as that is used to separate the
Christof Douma	f617e67	2016-01-12 10:23:13 +0000	[diff] [blame]	1060	arguments of the ``-debug-only`` option.
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	1061
Sylvestre Ledru	1623b46	2014-09-25 10:58:16 +0000	[diff] [blame]	1062	For performance reasons, -debug-only is not available in optimized build
				1063	(``--enable-optimized``) of LLVM.
Sylvestre Ledru	b5984fa	2014-09-25 10:57:00 +0000	[diff] [blame]	1064
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	1065	The ``DEBUG_WITH_TYPE`` macro is also available for situations where you would
				1066	like to set ``DEBUG_TYPE``, but only for one specific ``DEBUG`` statement. It
				1067	takes an additional first parameter, which is the type to use. For example, the
				1068	preceding example could be written as:
				1069
				1070	.. code-block:: c++
				1071
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	1072	DEBUG_WITH_TYPE("foo", errs() << "'foo' debug type\n");
				1073	DEBUG_WITH_TYPE("bar", errs() << "'bar' debug type\n"));
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	1074
				1075	.. _Statistic:
				1076
				1077	The ``Statistic`` class & ``-stats`` option
				1078	-------------------------------------------
				1079
				1080	The ``llvm/ADT/Statistic.h`` (`doxygen
				1081	<http://llvm.org/doxygen/Statistic_8h-source.html>`__) file provides a class
				1082	named ``Statistic`` that is used as a unified way to keep track of what the LLVM
				1083	compiler is doing and how effective various optimizations are. It is useful to
				1084	see what optimizations are contributing to making a particular program run
				1085	faster.
				1086
				1087	Often you may run your pass on some big program, and you're interested to see
				1088	how many times it makes a certain transformation. Although you can do this with
				1089	hand inspection, or some ad-hoc method, this is a real pain and not very useful
				1090	for big programs. Using the ``Statistic`` class makes it very easy to keep
				1091	track of this information, and the calculated information is presented in a
				1092	uniform manner with the rest of the passes being executed.
				1093
				1094	There are many examples of ``Statistic`` uses, but the basics of using it are as
				1095	follows:
				1096
				1097	#. Define your statistic like this:
				1098
				1099	.. code-block:: c++
				1100
				1101	#define DEBUG_TYPE "mypassname" // This goes before any #includes.
				1102	STATISTIC(NumXForms, "The # of times I did stuff");
				1103
				1104	The ``STATISTIC`` macro defines a static variable, whose name is specified by
				1105	the first argument. The pass name is taken from the ``DEBUG_TYPE`` macro, and
				1106	the description is taken from the second argument. The variable defined
				1107	("NumXForms" in this case) acts like an unsigned integer.
				1108
				1109	#. Whenever you make a transformation, bump the counter:
				1110
				1111	.. code-block:: c++
				1112
				1113	++NumXForms; // I did stuff!
				1114
				1115	That's all you have to do. To get '``opt``' to print out the statistics
				1116	gathered, use the '``-stats``' option:
				1117
				1118	.. code-block:: none
				1119
				1120	$ opt -stats -mypassname < program.bc > /dev/null
				1121	... statistics output ...
				1122
Justin Bogner	08f36fd	2015-02-21 20:53:36 +0000	[diff] [blame]	1123	Note that in order to use the '``-stats``' option, LLVM must be
				1124	compiled with assertions enabled.
				1125
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	1126	When running ``opt`` on a C file from the SPEC benchmark suite, it gives a
				1127	report that looks like this:
				1128
				1129	.. code-block:: none
				1130
				1131	7646 bitcodewriter - Number of normal instructions
				1132	725 bitcodewriter - Number of oversized instructions
				1133	129996 bitcodewriter - Number of bitcode bytes written
				1134	2817 raise - Number of insts DCEd or constprop'd
				1135	3213 raise - Number of cast-of-self removed
				1136	5046 raise - Number of expression trees converted
				1137	75 raise - Number of other getelementptr's formed
				1138	138 raise - Number of load/store peepholes
				1139	42 deadtypeelim - Number of unused typenames removed from symtab
				1140	392 funcresolve - Number of varargs functions resolved
				1141	27 globaldce - Number of global variables removed
				1142	2 adce - Number of basic blocks removed
				1143	134 cee - Number of branches revectored
				1144	49 cee - Number of setcc instruction eliminated
				1145	532 gcse - Number of loads removed
				1146	2919 gcse - Number of instructions removed
				1147	86 indvars - Number of canonical indvars added
				1148	87 indvars - Number of aux indvars removed
				1149	25 instcombine - Number of dead inst eliminate
				1150	434 instcombine - Number of insts combined
				1151	248 licm - Number of load insts hoisted
				1152	1298 licm - Number of insts hoisted to a loop pre-header
				1153	3 licm - Number of insts hoisted to multiple loop preds (bad, no loop pre-header)
				1154	75 mem2reg - Number of alloca's promoted
				1155	1444 cfgsimplify - Number of blocks simplified
				1156
				1157	Obviously, with so many optimizations, having a unified framework for this stuff
				1158	is very nice. Making your pass fit well into the framework makes it more
				1159	maintainable and useful.
				1160
				1161	.. _ViewGraph:
				1162
				1163	Viewing graphs while debugging code
				1164	-----------------------------------
				1165
				1166	Several of the important data structures in LLVM are graphs: for example CFGs
				1167	made out of LLVM :ref:`BasicBlocks <BasicBlock>`, CFGs made out of LLVM
				1168	:ref:`MachineBasicBlocks <MachineBasicBlock>`, and :ref:`Instruction Selection
				1169	DAGs <SelectionDAG>`. In many cases, while debugging various parts of the
				1170	compiler, it is nice to instantly visualize these graphs.
				1171
				1172	LLVM provides several callbacks that are available in a debug build to do
				1173	exactly that. If you call the ``Function::viewCFG()`` method, for example, the
				1174	current LLVM tool will pop up a window containing the CFG for the function where
				1175	each basic block is a node in the graph, and each node contains the instructions
				1176	in the block. Similarly, there also exists ``Function::viewCFGOnly()`` (does
				1177	not include the instructions), the ``MachineFunction::viewCFG()`` and
				1178	``MachineFunction::viewCFGOnly()``, and the ``SelectionDAG::viewGraph()``
				1179	methods. Within GDB, for example, you can usually use something like ``call
				1180	DAG.viewGraph()`` to pop up a window. Alternatively, you can sprinkle calls to
				1181	these functions in your code in places you want to debug.
				1182
Alp Toker	125be84	2014-06-02 01:40:04 +0000	[diff] [blame]	1183	Getting this to work requires a small amount of setup. On Unix systems
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	1184	with X11, install the `graphviz <http://www.graphviz.org>`_ toolkit, and make
Nico Weber	ad15692	2014-03-07 18:08:54 +0000	[diff] [blame]	1185	sure 'dot' and 'gv' are in your path. If you are running on Mac OS X, download
				1186	and install the Mac OS X `Graphviz program
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	1187	<http://www.pixelglow.com/graphviz/>`_ and add
				1188	``/Applications/Graphviz.app/Contents/MacOS/`` (or wherever you install it) to
Alp Toker	125be84	2014-06-02 01:40:04 +0000	[diff] [blame]	1189	your path. The programs need not be present when configuring, building or
				1190	running LLVM and can simply be installed when needed during an active debug
				1191	session.
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	1192
				1193	``SelectionDAG`` has been extended to make it easier to locate interesting
				1194	nodes in large complex graphs. From gdb, if you ``call DAG.setGraphColor(node,
				1195	"color")``, then the next ``call DAG.viewGraph()`` would highlight the node in
				1196	the specified color (choices of colors can be found at `colors
				1197	<http://www.graphviz.org/doc/info/colors.html>`_.) More complex node attributes
				1198	can be provided with ``call DAG.setGraphAttrs(node, "attributes")`` (choices can
				1199	be found at `Graph attributes <http://www.graphviz.org/doc/info/attrs.html>`_.)
				1200	If you want to restart and clear all the current graph attributes, then you can
				1201	``call DAG.clearGraphAttrs()``.
				1202
				1203	Note that graph visualization features are compiled out of Release builds to
				1204	reduce file size. This means that you need a Debug+Asserts or Release+Asserts
				1205	build to use these features.
				1206
				1207	.. _datastructure:
				1208
				1209	Picking the Right Data Structure for a Task
				1210	===========================================
				1211
				1212	LLVM has a plethora of data structures in the ``llvm/ADT/`` directory, and we
				1213	commonly use STL data structures. This section describes the trade-offs you
				1214	should consider when you pick one.
				1215
				1216	The first step is a choose your own adventure: do you want a sequential
				1217	container, a set-like container, or a map-like container? The most important
				1218	thing when choosing a container is the algorithmic properties of how you plan to
				1219	access the container. Based on that, you should use:
				1220
				1221
				1222	* a :ref:`map-like <ds_map>` container if you need efficient look-up of a
				1223	value based on another value. Map-like containers also support efficient
				1224	queries for containment (whether a key is in the map). Map-like containers
				1225	generally do not support efficient reverse mapping (values to keys). If you
				1226	need that, use two maps. Some map-like containers also support efficient
				1227	iteration through the keys in sorted order. Map-like containers are the most
				1228	expensive sort, only use them if you need one of these capabilities.
				1229
				1230	* a :ref:`set-like <ds_set>` container if you need to put a bunch of stuff into
				1231	a container that automatically eliminates duplicates. Some set-like
				1232	containers support efficient iteration through the elements in sorted order.
				1233	Set-like containers are more expensive than sequential containers.
				1234
				1235	* a :ref:`sequential <ds_sequential>` container provides the most efficient way
				1236	to add elements and keeps track of the order they are added to the collection.
				1237	They permit duplicates and support efficient iteration, but do not support
				1238	efficient look-up based on a key.
				1239
				1240	* a :ref:`string <ds_string>` container is a specialized sequential container or
				1241	reference structure that is used for character or byte arrays.
				1242
				1243	* a :ref:`bit <ds_bit>` container provides an efficient way to store and
				1244	perform set operations on sets of numeric id's, while automatically
				1245	eliminating duplicates. Bit containers require a maximum of 1 bit for each
				1246	identifier you want to store.
				1247
				1248	Once the proper category of container is determined, you can fine tune the
				1249	memory use, constant factors, and cache behaviors of access by intelligently
				1250	picking a member of the category. Note that constant factors and cache behavior
				1251	can be a big deal. If you have a vector that usually only contains a few
				1252	elements (but could contain many), for example, it's much better to use
				1253	:ref:`SmallVector <dss_smallvector>` than :ref:`vector <dss_vector>`. Doing so
				1254	avoids (relatively) expensive malloc/free calls, which dwarf the cost of adding
				1255	the elements to the container.
				1256
				1257	.. _ds_sequential:
				1258
				1259	Sequential Containers (std::vector, std::list, etc)
				1260	---------------------------------------------------
				1261
				1262	There are a variety of sequential containers available for you, based on your
				1263	needs. Pick the first in this section that will do what you want.
				1264
				1265	.. _dss_arrayref:
				1266
				1267	llvm/ADT/ArrayRef.h
				1268	^^^^^^^^^^^^^^^^^^^
				1269
				1270	The ``llvm::ArrayRef`` class is the preferred class to use in an interface that
				1271	accepts a sequential list of elements in memory and just reads from them. By
				1272	taking an ``ArrayRef``, the API can be passed a fixed size array, an
				1273	``std::vector``, an ``llvm::SmallVector`` and anything else that is contiguous
				1274	in memory.
				1275
				1276	.. _dss_fixedarrays:
				1277
				1278	Fixed Size Arrays
				1279	^^^^^^^^^^^^^^^^^
				1280
				1281	Fixed size arrays are very simple and very fast. They are good if you know
				1282	exactly how many elements you have, or you have a (low) upper bound on how many
				1283	you have.
				1284
				1285	.. _dss_heaparrays:
				1286
				1287	Heap Allocated Arrays
				1288	^^^^^^^^^^^^^^^^^^^^^
				1289
				1290	Heap allocated arrays (``new[]`` + ``delete[]``) are also simple. They are good
				1291	if the number of elements is variable, if you know how many elements you will
				1292	need before the array is allocated, and if the array is usually large (if not,
				1293	consider a :ref:`SmallVector <dss_smallvector>`). The cost of a heap allocated
				1294	array is the cost of the new/delete (aka malloc/free). Also note that if you
				1295	are allocating an array of a type with a constructor, the constructor and
				1296	destructors will be run for every element in the array (re-sizable vectors only
				1297	construct those elements actually used).
				1298
				1299	.. _dss_tinyptrvector:
				1300
				1301	llvm/ADT/TinyPtrVector.h
				1302	^^^^^^^^^^^^^^^^^^^^^^^^
				1303
				1304	``TinyPtrVector<Type>`` is a highly specialized collection class that is
				1305	optimized to avoid allocation in the case when a vector has zero or one
				1306	elements. It has two major restrictions: 1) it can only hold values of pointer
				1307	type, and 2) it cannot hold a null pointer.
				1308
				1309	Since this container is highly specialized, it is rarely used.
				1310
				1311	.. _dss_smallvector:
				1312
				1313	llvm/ADT/SmallVector.h
				1314	^^^^^^^^^^^^^^^^^^^^^^
				1315
				1316	``SmallVector<Type, N>`` is a simple class that looks and smells just like
				1317	``vector<Type>``: it supports efficient iteration, lays out elements in memory
				1318	order (so you can do pointer arithmetic between elements), supports efficient
				1319	push_back/pop_back operations, supports efficient random access to its elements,
				1320	etc.
				1321
				1322	The advantage of SmallVector is that it allocates space for some number of
				1323	elements (N) in the object itself. Because of this, if the SmallVector is
				1324	dynamically smaller than N, no malloc is performed. This can be a big win in
				1325	cases where the malloc/free call is far more expensive than the code that
				1326	fiddles around with the elements.
				1327
				1328	This is good for vectors that are "usually small" (e.g. the number of
				1329	predecessors/successors of a block is usually less than 8). On the other hand,
				1330	this makes the size of the SmallVector itself large, so you don't want to
				1331	allocate lots of them (doing so will waste a lot of space). As such,
				1332	SmallVectors are most useful when on the stack.
				1333
				1334	SmallVector also provides a nice portable and efficient replacement for
				1335	``alloca``.
				1336
Sean Silva	4ee92f9	2013-03-22 23:41:29 +0000	[diff] [blame]	1337	.. note::
				1338
Sean Silva	4359068	2013-03-22 23:52:38 +0000	[diff] [blame]	1339	Prefer to use ``SmallVectorImpl<T>`` as a parameter type.
Sean Silva	4ee92f9	2013-03-22 23:41:29 +0000	[diff] [blame]	1340
				1341	In APIs that don't care about the "small size" (most?), prefer to use
				1342	the ``SmallVectorImpl<T>`` class, which is basically just the "vector
				1343	header" (and methods) without the elements allocated after it. Note that
				1344	``SmallVector<T, N>`` inherits from ``SmallVectorImpl<T>`` so the
				1345	conversion is implicit and costs nothing. E.g.
				1346
				1347	.. code-block:: c++
				1348
				1349	// BAD: Clients cannot pass e.g. SmallVector<Foo, 4>.
				1350	hardcodedSmallSize(SmallVector<Foo, 2> &Out);
				1351	// GOOD: Clients can pass any SmallVector<Foo, N>.
				1352	allowsAnySmallSize(SmallVectorImpl<Foo> &Out);
				1353
				1354	void someFunc() {
				1355	SmallVector<Foo, 8> Vec;
				1356	hardcodedSmallSize(Vec); // Error.
				1357	allowsAnySmallSize(Vec); // Works.
				1358	}
				1359
				1360	Even though it has "``Impl``" in the name, this is so widely used that
				1361	it really isn't "private to the implementation" anymore. A name like
				1362	``SmallVectorHeader`` would be more appropriate.
				1363
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	1364	.. _dss_vector:
				1365
				1366	<vector>
				1367	^^^^^^^^
				1368
				1369	``std::vector`` is well loved and respected. It is useful when SmallVector
				1370	isn't: when the size of the vector is often large (thus the small optimization
				1371	will rarely be a benefit) or if you will be allocating many instances of the
				1372	vector itself (which would waste space for elements that aren't in the
				1373	container). vector is also useful when interfacing with code that expects
				1374	vectors :).
				1375
				1376	One worthwhile note about std::vector: avoid code like this:
				1377
				1378	.. code-block:: c++
				1379
				1380	for ( ... ) {
				1381	std::vector<foo> V;
				1382	// make use of V.
				1383	}
				1384
				1385	Instead, write this as:
				1386
				1387	.. code-block:: c++
				1388
				1389	std::vector<foo> V;
				1390	for ( ... ) {
				1391	// make use of V.
				1392	V.clear();
				1393	}
				1394
				1395	Doing so will save (at least) one heap allocation and free per iteration of the
				1396	loop.
				1397
				1398	.. _dss_deque:
				1399
				1400	<deque>
				1401	^^^^^^^
				1402
				1403	``std::deque`` is, in some senses, a generalized version of ``std::vector``.
				1404	Like ``std::vector``, it provides constant time random access and other similar
				1405	properties, but it also provides efficient access to the front of the list. It
				1406	does not guarantee continuity of elements within memory.
				1407
				1408	In exchange for this extra flexibility, ``std::deque`` has significantly higher
				1409	constant factor costs than ``std::vector``. If possible, use ``std::vector`` or
				1410	something cheaper.
				1411
				1412	.. _dss_list:
				1413
				1414	<list>
				1415	^^^^^^
				1416
				1417	``std::list`` is an extremely inefficient class that is rarely useful. It
				1418	performs a heap allocation for every element inserted into it, thus having an
				1419	extremely high constant factor, particularly for small data types.
				1420	``std::list`` also only supports bidirectional iteration, not random access
				1421	iteration.
				1422
				1423	In exchange for this high cost, std::list supports efficient access to both ends
				1424	of the list (like ``std::deque``, but unlike ``std::vector`` or
				1425	``SmallVector``). In addition, the iterator invalidation characteristics of
				1426	std::list are stronger than that of a vector class: inserting or removing an
				1427	element into the list does not invalidate iterator or pointers to other elements
				1428	in the list.
				1429
				1430	.. _dss_ilist:
				1431
				1432	llvm/ADT/ilist.h
				1433	^^^^^^^^^^^^^^^^
				1434
				1435	``ilist<T>`` implements an 'intrusive' doubly-linked list. It is intrusive,
				1436	because it requires the element to store and provide access to the prev/next
				1437	pointers for the list.
				1438
				1439	``ilist`` has the same drawbacks as ``std::list``, and additionally requires an
				1440	``ilist_traits`` implementation for the element type, but it provides some novel
				1441	characteristics. In particular, it can efficiently store polymorphic objects,
				1442	the traits class is informed when an element is inserted or removed from the
				1443	list, and ``ilist``\ s are guaranteed to support a constant-time splice
				1444	operation.
				1445
				1446	These properties are exactly what we want for things like ``Instruction``\ s and
				1447	basic blocks, which is why these are implemented with ``ilist``\ s.
				1448
				1449	Related classes of interest are explained in the following subsections:
				1450
				1451	* :ref:`ilist_traits <dss_ilist_traits>`
				1452
				1453	* :ref:`iplist <dss_iplist>`
				1454
				1455	* :ref:`llvm/ADT/ilist_node.h <dss_ilist_node>`
				1456
				1457	* :ref:`Sentinels <dss_ilist_sentinel>`
				1458
				1459	.. _dss_packedvector:
				1460
				1461	llvm/ADT/PackedVector.h
				1462	^^^^^^^^^^^^^^^^^^^^^^^
				1463
				1464	Useful for storing a vector of values using only a few number of bits for each
				1465	value. Apart from the standard operations of a vector-like container, it can
				1466	also perform an 'or' set operation.
				1467
				1468	For example:
				1469
				1470	.. code-block:: c++
				1471
				1472	enum State {
				1473	None = 0x0,
				1474	FirstCondition = 0x1,
				1475	SecondCondition = 0x2,
				1476	Both = 0x3
				1477	};
				1478
				1479	State get() {
				1480	PackedVector<State, 2> Vec1;
				1481	Vec1.push_back(FirstCondition);
				1482
				1483	PackedVector<State, 2> Vec2;
				1484	Vec2.push_back(SecondCondition);
				1485
				1486	Vec1 \|= Vec2;
				1487	return Vec1[0]; // returns 'Both'.
				1488	}
				1489
				1490	.. _dss_ilist_traits:
				1491
				1492	ilist_traits
				1493	^^^^^^^^^^^^
				1494
				1495	``ilist_traits<T>`` is ``ilist<T>``'s customization mechanism. ``iplist<T>``
				1496	(and consequently ``ilist<T>``) publicly derive from this traits class.
				1497
				1498	.. _dss_iplist:
				1499
				1500	iplist
				1501	^^^^^^
				1502
				1503	``iplist<T>`` is ``ilist<T>``'s base and as such supports a slightly narrower
				1504	interface. Notably, inserters from ``T&`` are absent.
				1505
				1506	``ilist_traits<T>`` is a public base of this class and can be used for a wide
				1507	variety of customizations.
				1508
				1509	.. _dss_ilist_node:
				1510
				1511	llvm/ADT/ilist_node.h
				1512	^^^^^^^^^^^^^^^^^^^^^
				1513
Robin Morisset	039781e	2014-08-29 21:53:01 +0000	[diff] [blame]	1514	``ilist_node<T>`` implements the forward and backward links that are expected
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	1515	by the ``ilist<T>`` (and analogous containers) in the default manner.
				1516
				1517	``ilist_node<T>``\ s are meant to be embedded in the node type ``T``, usually
				1518	``T`` publicly derives from ``ilist_node<T>``.
				1519
				1520	.. _dss_ilist_sentinel:
				1521
				1522	Sentinels
				1523	^^^^^^^^^
				1524
				1525	``ilist``\ s have another specialty that must be considered. To be a good
				1526	citizen in the C++ ecosystem, it needs to support the standard container
				1527	operations, such as ``begin`` and ``end`` iterators, etc. Also, the
				1528	``operator--`` must work correctly on the ``end`` iterator in the case of
				1529	non-empty ``ilist``\ s.
				1530
				1531	The only sensible solution to this problem is to allocate a so-called sentinel
				1532	along with the intrusive list, which serves as the ``end`` iterator, providing
				1533	the back-link to the last element. However conforming to the C++ convention it
				1534	is illegal to ``operator++`` beyond the sentinel and it also must not be
				1535	dereferenced.
				1536
				1537	These constraints allow for some implementation freedom to the ``ilist`` how to
				1538	allocate and store the sentinel. The corresponding policy is dictated by
				1539	``ilist_traits<T>``. By default a ``T`` gets heap-allocated whenever the need
				1540	for a sentinel arises.
				1541
				1542	While the default policy is sufficient in most cases, it may break down when
				1543	``T`` does not provide a default constructor. Also, in the case of many
				1544	instances of ``ilist``\ s, the memory overhead of the associated sentinels is
				1545	wasted. To alleviate the situation with numerous and voluminous
				1546	``T``-sentinels, sometimes a trick is employed, leading to ghostly sentinels.
				1547
				1548	Ghostly sentinels are obtained by specially-crafted ``ilist_traits<T>`` which
				1549	superpose the sentinel with the ``ilist`` instance in memory. Pointer
				1550	arithmetic is used to obtain the sentinel, which is relative to the ``ilist``'s
				1551	``this`` pointer. The ``ilist`` is augmented by an extra pointer, which serves
				1552	as the back-link of the sentinel. This is the only field in the ghostly
				1553	sentinel which can be legally accessed.
				1554
				1555	.. _dss_other:
				1556
				1557	Other Sequential Container options
				1558	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				1559
				1560	Other STL containers are available, such as ``std::string``.
				1561
				1562	There are also various STL adapter classes such as ``std::queue``,
				1563	``std::priority_queue``, ``std::stack``, etc. These provide simplified access
				1564	to an underlying container but don't affect the cost of the container itself.
				1565
				1566	.. _ds_string:
				1567
				1568	String-like containers
				1569	----------------------
				1570
				1571	There are a variety of ways to pass around and use strings in C and C++, and
				1572	LLVM adds a few new options to choose from. Pick the first option on this list
				1573	that will do what you need, they are ordered according to their relative cost.
				1574
Ed Maste	8ed40ce	2015-04-14 20:52:58 +0000	[diff] [blame]	1575	Note that it is generally preferred to not pass strings around as ``const
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	1576	char*``'s. These have a number of problems, including the fact that they
				1577	cannot represent embedded nul ("\0") characters, and do not have a length
				1578	available efficiently. The general replacement for '``const char*``' is
				1579	StringRef.
				1580
				1581	For more information on choosing string containers for APIs, please see
				1582	:ref:`Passing Strings <string_apis>`.
				1583
				1584	.. _dss_stringref:
				1585
				1586	llvm/ADT/StringRef.h
				1587	^^^^^^^^^^^^^^^^^^^^
				1588
				1589	The StringRef class is a simple value class that contains a pointer to a
				1590	character and a length, and is quite related to the :ref:`ArrayRef
				1591	<dss_arrayref>` class (but specialized for arrays of characters). Because
				1592	StringRef carries a length with it, it safely handles strings with embedded nul
				1593	characters in it, getting the length does not require a strlen call, and it even
				1594	has very convenient APIs for slicing and dicing the character range that it
				1595	represents.
				1596
				1597	StringRef is ideal for passing simple strings around that are known to be live,
				1598	either because they are C string literals, std::string, a C array, or a
				1599	SmallVector. Each of these cases has an efficient implicit conversion to
				1600	StringRef, which doesn't result in a dynamic strlen being executed.
				1601
				1602	StringRef has a few major limitations which make more powerful string containers
				1603	useful:
				1604
				1605	#. You cannot directly convert a StringRef to a 'const char*' because there is
				1606	no way to add a trailing nul (unlike the .c_str() method on various stronger
				1607	classes).
				1608
				1609	#. StringRef doesn't own or keep alive the underlying string bytes.
				1610	As such it can easily lead to dangling pointers, and is not suitable for
				1611	embedding in datastructures in most cases (instead, use an std::string or
				1612	something like that).
				1613
				1614	#. For the same reason, StringRef cannot be used as the return value of a
				1615	method if the method "computes" the result string. Instead, use std::string.
				1616
				1617	#. StringRef's do not allow you to mutate the pointed-to string bytes and it
				1618	doesn't allow you to insert or remove bytes from the range. For editing
				1619	operations like this, it interoperates with the :ref:`Twine <dss_twine>`
				1620	class.
				1621
				1622	Because of its strengths and limitations, it is very common for a function to
				1623	take a StringRef and for a method on an object to return a StringRef that points
				1624	into some string that it owns.
				1625
				1626	.. _dss_twine:
				1627
				1628	llvm/ADT/Twine.h
				1629	^^^^^^^^^^^^^^^^
				1630
				1631	The Twine class is used as an intermediary datatype for APIs that want to take a
				1632	string that can be constructed inline with a series of concatenations. Twine
				1633	works by forming recursive instances of the Twine datatype (a simple value
				1634	object) on the stack as temporary objects, linking them together into a tree
				1635	which is then linearized when the Twine is consumed. Twine is only safe to use
				1636	as the argument to a function, and should always be a const reference, e.g.:
				1637
				1638	.. code-block:: c++
				1639
				1640	void foo(const Twine &T);
				1641	...
				1642	StringRef X = ...
				1643	unsigned i = ...
				1644	foo(X + "." + Twine(i));
				1645
				1646	This example forms a string like "blarg.42" by concatenating the values
				1647	together, and does not form intermediate strings containing "blarg" or "blarg.".
				1648
				1649	Because Twine is constructed with temporary objects on the stack, and because
				1650	these instances are destroyed at the end of the current statement, it is an
				1651	inherently dangerous API. For example, this simple variant contains undefined
				1652	behavior and will probably crash:
				1653
				1654	.. code-block:: c++
				1655
				1656	void foo(const Twine &T);
				1657	...
				1658	StringRef X = ...
				1659	unsigned i = ...
				1660	const Twine &Tmp = X + "." + Twine(i);
				1661	foo(Tmp);
				1662
				1663	... because the temporaries are destroyed before the call. That said, Twine's
				1664	are much more efficient than intermediate std::string temporaries, and they work
				1665	really well with StringRef. Just be aware of their limitations.
				1666
				1667	.. _dss_smallstring:
				1668
				1669	llvm/ADT/SmallString.h
				1670	^^^^^^^^^^^^^^^^^^^^^^
				1671
				1672	SmallString is a subclass of :ref:`SmallVector <dss_smallvector>` that adds some
				1673	convenience APIs like += that takes StringRef's. SmallString avoids allocating
				1674	memory in the case when the preallocated space is enough to hold its data, and
				1675	it calls back to general heap allocation when required. Since it owns its data,
				1676	it is very safe to use and supports full mutation of the string.
				1677
				1678	Like SmallVector's, the big downside to SmallString is their sizeof. While they
				1679	are optimized for small strings, they themselves are not particularly small.
				1680	This means that they work great for temporary scratch buffers on the stack, but
				1681	should not generally be put into the heap: it is very rare to see a SmallString
				1682	as the member of a frequently-allocated heap data structure or returned
				1683	by-value.
				1684
				1685	.. _dss_stdstring:
				1686
				1687	std::string
				1688	^^^^^^^^^^^
				1689
				1690	The standard C++ std::string class is a very general class that (like
				1691	SmallString) owns its underlying data. sizeof(std::string) is very reasonable
				1692	so it can be embedded into heap data structures and returned by-value. On the
				1693	other hand, std::string is highly inefficient for inline editing (e.g.
				1694	concatenating a bunch of stuff together) and because it is provided by the
				1695	standard library, its performance characteristics depend a lot of the host
				1696	standard library (e.g. libc++ and MSVC provide a highly optimized string class,
				1697	GCC contains a really slow implementation).
				1698
				1699	The major disadvantage of std::string is that almost every operation that makes
				1700	them larger can allocate memory, which is slow. As such, it is better to use
				1701	SmallVector or Twine as a scratch buffer, but then use std::string to persist
				1702	the result.
				1703
				1704	.. _ds_set:
				1705
				1706	Set-Like Containers (std::set, SmallSet, SetVector, etc)
				1707	--------------------------------------------------------
				1708
				1709	Set-like containers are useful when you need to canonicalize multiple values
				1710	into a single representation. There are several different choices for how to do
				1711	this, providing various trade-offs.
				1712
				1713	.. _dss_sortedvectorset:
				1714
				1715	A sorted 'vector'
				1716	^^^^^^^^^^^^^^^^^
				1717
				1718	If you intend to insert a lot of elements, then do a lot of queries, a great
				1719	approach is to use a vector (or other sequential container) with
				1720	std::sort+std::unique to remove duplicates. This approach works really well if
				1721	your usage pattern has these two distinct phases (insert then query), and can be
				1722	coupled with a good choice of :ref:`sequential container <ds_sequential>`.
				1723
				1724	This combination provides the several nice properties: the result data is
				1725	contiguous in memory (good for cache locality), has few allocations, is easy to
				1726	address (iterators in the final vector are just indices or pointers), and can be
Sean Silva	c9fbd23	2013-03-29 21:57:47 +0000	[diff] [blame]	1727	efficiently queried with a standard binary search (e.g.
				1728	``std::lower_bound``; if you want the whole range of elements comparing
				1729	equal, use ``std::equal_range``).
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	1730
				1731	.. _dss_smallset:
				1732
				1733	llvm/ADT/SmallSet.h
				1734	^^^^^^^^^^^^^^^^^^^
				1735
				1736	If you have a set-like data structure that is usually small and whose elements
				1737	are reasonably small, a ``SmallSet<Type, N>`` is a good choice. This set has
				1738	space for N elements in place (thus, if the set is dynamically smaller than N,
				1739	no malloc traffic is required) and accesses them with a simple linear search.
Artyom Skrobov	6264115	2015-05-19 10:21:12 +0000	[diff] [blame]	1740	When the set grows beyond N elements, it allocates a more expensive
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	1741	representation that guarantees efficient access (for most types, it falls back
Artyom Skrobov	6264115	2015-05-19 10:21:12 +0000	[diff] [blame]	1742	to :ref:`std::set <dss_set>`, but for pointers it uses something far better,
				1743	:ref:`SmallPtrSet <dss_smallptrset>`.
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	1744
				1745	The magic of this class is that it handles small sets extremely efficiently, but
				1746	gracefully handles extremely large sets without loss of efficiency. The
				1747	drawback is that the interface is quite small: it supports insertion, queries
				1748	and erasing, but does not support iteration.
				1749
				1750	.. _dss_smallptrset:
				1751
				1752	llvm/ADT/SmallPtrSet.h
				1753	^^^^^^^^^^^^^^^^^^^^^^
				1754
Artyom Skrobov	6264115	2015-05-19 10:21:12 +0000	[diff] [blame]	1755	``SmallPtrSet`` has all the advantages of ``SmallSet`` (and a ``SmallSet`` of
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	1756	pointers is transparently implemented with a ``SmallPtrSet``), but also supports
Artyom Skrobov	6264115	2015-05-19 10:21:12 +0000	[diff] [blame]	1757	iterators. If more than N insertions are performed, a single quadratically
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	1758	probed hash table is allocated and grows as needed, providing extremely
				1759	efficient access (constant time insertion/deleting/queries with low constant
				1760	factors) and is very stingy with malloc traffic.
				1761
Artyom Skrobov	6264115	2015-05-19 10:21:12 +0000	[diff] [blame]	1762	Note that, unlike :ref:`std::set <dss_set>`, the iterators of ``SmallPtrSet``
				1763	are invalidated whenever an insertion occurs. Also, the values visited by the
				1764	iterators are not visited in sorted order.
				1765
				1766	.. _dss_stringset:
				1767
				1768	llvm/ADT/StringSet.h
				1769	^^^^^^^^^^^^^^^^^^^^
				1770
				1771	``StringSet`` is a thin wrapper around :ref:`StringMap\<char\> <dss_stringmap>`,
				1772	and it allows efficient storage and retrieval of unique strings.
				1773
Sylvestre Ledru	84666a1	2016-02-14 20:16:22 +0000	[diff] [blame]	1774	Functionally analogous to ``SmallSet<StringRef>``, ``StringSet`` also supports
Artyom Skrobov	6264115	2015-05-19 10:21:12 +0000	[diff] [blame]	1775	iteration. (The iterator dereferences to a ``StringMapEntry<char>``, so you
				1776	need to call ``i->getKey()`` to access the item of the StringSet.) On the
				1777	other hand, ``StringSet`` doesn't support range-insertion and
				1778	copy-construction, which :ref:`SmallSet <dss_smallset>` and :ref:`SmallPtrSet
				1779	<dss_smallptrset>` do support.
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	1780
				1781	.. _dss_denseset:
				1782
				1783	llvm/ADT/DenseSet.h
				1784	^^^^^^^^^^^^^^^^^^^
				1785
				1786	DenseSet is a simple quadratically probed hash table. It excels at supporting
				1787	small values: it uses a single allocation to hold all of the pairs that are
				1788	currently inserted in the set. DenseSet is a great way to unique small values
				1789	that are not simple pointers (use :ref:`SmallPtrSet <dss_smallptrset>` for
				1790	pointers). Note that DenseSet has the same requirements for the value type that
				1791	:ref:`DenseMap <dss_densemap>` has.
				1792
				1793	.. _dss_sparseset:
				1794
				1795	llvm/ADT/SparseSet.h
				1796	^^^^^^^^^^^^^^^^^^^^
				1797
				1798	SparseSet holds a small number of objects identified by unsigned keys of
				1799	moderate size. It uses a lot of memory, but provides operations that are almost
				1800	as fast as a vector. Typical keys are physical registers, virtual registers, or
				1801	numbered basic blocks.
				1802
				1803	SparseSet is useful for algorithms that need very fast clear/find/insert/erase
				1804	and fast iteration over small sets. It is not intended for building composite
				1805	data structures.
				1806
Michael Ilseman	830875b	2013-01-21 21:46:32 +0000	[diff] [blame]	1807	.. _dss_sparsemultiset:
				1808
				1809	llvm/ADT/SparseMultiSet.h
				1810	^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				1811
				1812	SparseMultiSet adds multiset behavior to SparseSet, while retaining SparseSet's
				1813	desirable attributes. Like SparseSet, it typically uses a lot of memory, but
				1814	provides operations that are almost as fast as a vector. Typical keys are
				1815	physical registers, virtual registers, or numbered basic blocks.
				1816
				1817	SparseMultiSet is useful for algorithms that need very fast
				1818	clear/find/insert/erase of the entire collection, and iteration over sets of
				1819	elements sharing a key. It is often a more efficient choice than using composite
				1820	data structures (e.g. vector-of-vectors, map-of-vectors). It is not intended for
				1821	building composite data structures.
				1822
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	1823	.. _dss_FoldingSet:
				1824
				1825	llvm/ADT/FoldingSet.h
				1826	^^^^^^^^^^^^^^^^^^^^^
				1827
				1828	FoldingSet is an aggregate class that is really good at uniquing
				1829	expensive-to-create or polymorphic objects. It is a combination of a chained
				1830	hash table with intrusive links (uniqued objects are required to inherit from
				1831	FoldingSetNode) that uses :ref:`SmallVector <dss_smallvector>` as part of its ID
				1832	process.
				1833
				1834	Consider a case where you want to implement a "getOrCreateFoo" method for a
				1835	complex object (for example, a node in the code generator). The client has a
				1836	description of what it wants to generate (it knows the opcode and all the
				1837	operands), but we don't want to 'new' a node, then try inserting it into a set
				1838	only to find out it already exists, at which point we would have to delete it
				1839	and return the node that already exists.
				1840
				1841	To support this style of client, FoldingSet perform a query with a
				1842	FoldingSetNodeID (which wraps SmallVector) that can be used to describe the
				1843	element that we want to query for. The query either returns the element
				1844	matching the ID or it returns an opaque ID that indicates where insertion should
				1845	take place. Construction of the ID usually does not require heap traffic.
				1846
				1847	Because FoldingSet uses intrusive links, it can support polymorphic objects in
				1848	the set (for example, you can have SDNode instances mixed with LoadSDNodes).
				1849	Because the elements are individually allocated, pointers to the elements are
				1850	stable: inserting or removing elements does not invalidate any pointers to other
				1851	elements.
				1852
				1853	.. _dss_set:
				1854
				1855	<set>
				1856	^^^^^
				1857
				1858	``std::set`` is a reasonable all-around set class, which is decent at many
				1859	things but great at nothing. std::set allocates memory for each element
				1860	inserted (thus it is very malloc intensive) and typically stores three pointers
				1861	per element in the set (thus adding a large amount of per-element space
				1862	overhead). It offers guaranteed log(n) performance, which is not particularly
				1863	fast from a complexity standpoint (particularly if the elements of the set are
				1864	expensive to compare, like strings), and has extremely high constant factors for
				1865	lookup, insertion and removal.
				1866
				1867	The advantages of std::set are that its iterators are stable (deleting or
				1868	inserting an element from the set does not affect iterators or pointers to other
				1869	elements) and that iteration over the set is guaranteed to be in sorted order.
				1870	If the elements in the set are large, then the relative overhead of the pointers
				1871	and malloc traffic is not a big deal, but if the elements of the set are small,
				1872	std::set is almost never a good choice.
				1873
				1874	.. _dss_setvector:
				1875
				1876	llvm/ADT/SetVector.h
				1877	^^^^^^^^^^^^^^^^^^^^
				1878
				1879	LLVM's ``SetVector<Type>`` is an adapter class that combines your choice of a
				1880	set-like container along with a :ref:`Sequential Container <ds_sequential>` The
				1881	important property that this provides is efficient insertion with uniquing
				1882	(duplicate elements are ignored) with iteration support. It implements this by
				1883	inserting elements into both a set-like container and the sequential container,
				1884	using the set-like container for uniquing and the sequential container for
				1885	iteration.
				1886
				1887	The difference between SetVector and other sets is that the order of iteration
				1888	is guaranteed to match the order of insertion into the SetVector. This property
				1889	is really important for things like sets of pointers. Because pointer values
				1890	are non-deterministic (e.g. vary across runs of the program on different
				1891	machines), iterating over the pointers in the set will not be in a well-defined
				1892	order.
				1893
				1894	The drawback of SetVector is that it requires twice as much space as a normal
				1895	set and has the sum of constant factors from the set-like container and the
				1896	sequential container that it uses. Use it only if you need to iterate over
				1897	the elements in a deterministic order. SetVector is also expensive to delete
Paul Robinson	687915f	2013-11-14 18:47:23 +0000	[diff] [blame]	1898	elements out of (linear time), unless you use its "pop_back" method, which is
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	1899	faster.
				1900
				1901	``SetVector`` is an adapter class that defaults to using ``std::vector`` and a
				1902	size 16 ``SmallSet`` for the underlying containers, so it is quite expensive.
				1903	However, ``"llvm/ADT/SetVector.h"`` also provides a ``SmallSetVector`` class,
				1904	which defaults to using a ``SmallVector`` and ``SmallSet`` of a specified size.
				1905	If you use this, and if your sets are dynamically smaller than ``N``, you will
				1906	save a lot of heap traffic.
				1907
				1908	.. _dss_uniquevector:
				1909
				1910	llvm/ADT/UniqueVector.h
				1911	^^^^^^^^^^^^^^^^^^^^^^^
				1912
				1913	UniqueVector is similar to :ref:`SetVector <dss_setvector>` but it retains a
				1914	unique ID for each element inserted into the set. It internally contains a map
				1915	and a vector, and it assigns a unique ID for each value inserted into the set.
				1916
				1917	UniqueVector is very expensive: its cost is the sum of the cost of maintaining
				1918	both the map and vector, it has high complexity, high constant factors, and
				1919	produces a lot of malloc traffic. It should be avoided.
				1920
				1921	.. _dss_immutableset:
				1922
				1923	llvm/ADT/ImmutableSet.h
				1924	^^^^^^^^^^^^^^^^^^^^^^^
				1925
				1926	ImmutableSet is an immutable (functional) set implementation based on an AVL
				1927	tree. Adding or removing elements is done through a Factory object and results
				1928	in the creation of a new ImmutableSet object. If an ImmutableSet already exists
				1929	with the given contents, then the existing one is returned; equality is compared
				1930	with a FoldingSetNodeID. The time and space complexity of add or remove
				1931	operations is logarithmic in the size of the original set.
				1932
				1933	There is no method for returning an element of the set, you can only check for
				1934	membership.
				1935
				1936	.. _dss_otherset:
				1937
				1938	Other Set-Like Container Options
				1939	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				1940
				1941	The STL provides several other options, such as std::multiset and the various
				1942	"hash_set" like containers (whether from C++ TR1 or from the SGI library). We
				1943	never use hash_set and unordered_set because they are generally very expensive
				1944	(each insertion requires a malloc) and very non-portable.
				1945
				1946	std::multiset is useful if you're not interested in elimination of duplicates,
Artyom Skrobov	6264115	2015-05-19 10:21:12 +0000	[diff] [blame]	1947	but has all the drawbacks of :ref:`std::set <dss_set>`. A sorted vector
				1948	(where you don't delete duplicate entries) or some other approach is almost
Aaron Ballman	9f154f6	2015-07-29 15:57:49 +0000	[diff] [blame]	1949	always better.
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	1950
				1951	.. _ds_map:
				1952
				1953	Map-Like Containers (std::map, DenseMap, etc)
				1954	---------------------------------------------
				1955
				1956	Map-like containers are useful when you want to associate data to a key. As
				1957	usual, there are a lot of different ways to do this. :)
				1958
				1959	.. _dss_sortedvectormap:
				1960
				1961	A sorted 'vector'
				1962	^^^^^^^^^^^^^^^^^
				1963
				1964	If your usage pattern follows a strict insert-then-query approach, you can
				1965	trivially use the same approach as :ref:`sorted vectors for set-like containers
				1966	<dss_sortedvectorset>`. The only difference is that your query function (which
				1967	uses std::lower_bound to get efficient log(n) lookup) should only compare the
				1968	key, not both the key and value. This yields the same advantages as sorted
				1969	vectors for sets.
				1970
				1971	.. _dss_stringmap:
				1972
				1973	llvm/ADT/StringMap.h
				1974	^^^^^^^^^^^^^^^^^^^^
				1975
				1976	Strings are commonly used as keys in maps, and they are difficult to support
				1977	efficiently: they are variable length, inefficient to hash and compare when
				1978	long, expensive to copy, etc. StringMap is a specialized container designed to
				1979	cope with these issues. It supports mapping an arbitrary range of bytes to an
				1980	arbitrary other object.
				1981
				1982	The StringMap implementation uses a quadratically-probed hash table, where the
				1983	buckets store a pointer to the heap allocated entries (and some other stuff).
				1984	The entries in the map must be heap allocated because the strings are variable
				1985	length. The string data (key) and the element object (value) are stored in the
				1986	same allocation with the string data immediately after the element object.
				1987	This container guarantees the "``(char*)(&Value+1)``" points to the key string
				1988	for a value.
				1989
				1990	The StringMap is very fast for several reasons: quadratic probing is very cache
				1991	efficient for lookups, the hash value of strings in buckets is not recomputed
				1992	when looking up an element, StringMap rarely has to touch the memory for
				1993	unrelated objects when looking up a value (even when hash collisions happen),
				1994	hash table growth does not recompute the hash values for strings already in the
				1995	table, and each pair in the map is store in a single allocation (the string data
				1996	is stored in the same allocation as the Value of a pair).
				1997
				1998	StringMap also provides query methods that take byte ranges, so it only ever
				1999	copies a string if a value is inserted into the table.
				2000
				2001	StringMap iteratation order, however, is not guaranteed to be deterministic, so
				2002	any uses which require that should instead use a std::map.
				2003
				2004	.. _dss_indexmap:
				2005
				2006	llvm/ADT/IndexedMap.h
				2007	^^^^^^^^^^^^^^^^^^^^^
				2008
				2009	IndexedMap is a specialized container for mapping small dense integers (or
				2010	values that can be mapped to small dense integers) to some other type. It is
				2011	internally implemented as a vector with a mapping function that maps the keys
				2012	to the dense integer range.
				2013
				2014	This is useful for cases like virtual registers in the LLVM code generator: they
				2015	have a dense mapping that is offset by a compile-time constant (the first
				2016	virtual register ID).
				2017
				2018	.. _dss_densemap:
				2019
				2020	llvm/ADT/DenseMap.h
				2021	^^^^^^^^^^^^^^^^^^^
				2022
				2023	DenseMap is a simple quadratically probed hash table. It excels at supporting
				2024	small keys and values: it uses a single allocation to hold all of the pairs
				2025	that are currently inserted in the map. DenseMap is a great way to map
				2026	pointers to pointers, or map other small types to each other.
				2027
				2028	There are several aspects of DenseMap that you should be aware of, however.
				2029	The iterators in a DenseMap are invalidated whenever an insertion occurs,
				2030	unlike map. Also, because DenseMap allocates space for a large number of
				2031	key/value pairs (it starts with 64 by default), it will waste a lot of space if
				2032	your keys or values are large. Finally, you must implement a partial
				2033	specialization of DenseMapInfo for the key that you want, if it isn't already
				2034	supported. This is required to tell DenseMap about two special marker values
				2035	(which can never be inserted into the map) that it needs internally.
				2036
				2037	DenseMap's find_as() method supports lookup operations using an alternate key
				2038	type. This is useful in cases where the normal key type is expensive to
				2039	construct, but cheap to compare against. The DenseMapInfo is responsible for
				2040	defining the appropriate comparison and hashing methods for each alternate key
				2041	type used.
				2042
				2043	.. _dss_valuemap:
				2044
Chandler Carruth	a4ea269	2014-03-04 11:26:31 +0000	[diff] [blame]	2045	llvm/IR/ValueMap.h
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	2046	^^^^^^^^^^^^^^^^^^^
				2047
				2048	ValueMap is a wrapper around a :ref:`DenseMap <dss_densemap>` mapping
				2049	``Value*``\ s (or subclasses) to another type. When a Value is deleted or
				2050	RAUW'ed, ValueMap will update itself so the new version of the key is mapped to
				2051	the same value, just as if the key were a WeakVH. You can configure exactly how
				2052	this happens, and what else happens on these two events, by passing a ``Config``
				2053	parameter to the ValueMap template.
				2054
				2055	.. _dss_intervalmap:
				2056
				2057	llvm/ADT/IntervalMap.h
				2058	^^^^^^^^^^^^^^^^^^^^^^
				2059
				2060	IntervalMap is a compact map for small keys and values. It maps key intervals
				2061	instead of single keys, and it will automatically coalesce adjacent intervals.
Hans Wennborg	8888d5b	2015-01-17 03:19:21 +0000	[diff] [blame]	2062	When the map only contains a few intervals, they are stored in the map object
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	2063	itself to avoid allocations.
				2064
				2065	The IntervalMap iterators are quite big, so they should not be passed around as
				2066	STL iterators. The heavyweight iterators allow a smaller data structure.
				2067
				2068	.. _dss_map:
				2069
				2070	<map>
				2071	^^^^^
				2072
				2073	std::map has similar characteristics to :ref:`std::set <dss_set>`: it uses a
				2074	single allocation per pair inserted into the map, it offers log(n) lookup with
				2075	an extremely large constant factor, imposes a space penalty of 3 pointers per
				2076	pair in the map, etc.
				2077
				2078	std::map is most useful when your keys or values are very large, if you need to
				2079	iterate over the collection in sorted order, or if you need stable iterators
				2080	into the map (i.e. they don't get invalidated if an insertion or deletion of
				2081	another element takes place).
				2082
				2083	.. _dss_mapvector:
				2084
				2085	llvm/ADT/MapVector.h
				2086	^^^^^^^^^^^^^^^^^^^^
				2087
				2088	``MapVector<KeyT,ValueT>`` provides a subset of the DenseMap interface. The
				2089	main difference is that the iteration order is guaranteed to be the insertion
				2090	order, making it an easy (but somewhat expensive) solution for non-deterministic
				2091	iteration over maps of pointers.
				2092
				2093	It is implemented by mapping from key to an index in a vector of key,value
Duncan P. N. Exon Smith	f51601c	2014-07-15 20:24:56 +0000	[diff] [blame]	2094	pairs. This provides fast lookup and iteration, but has two main drawbacks:
				2095	the key is stored twice and removing elements takes linear time. If it is
				2096	necessary to remove elements, it's best to remove them in bulk using
				2097	``remove_if()``.
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	2098
				2099	.. _dss_inteqclasses:
				2100
				2101	llvm/ADT/IntEqClasses.h
				2102	^^^^^^^^^^^^^^^^^^^^^^^
				2103
				2104	IntEqClasses provides a compact representation of equivalence classes of small
				2105	integers. Initially, each integer in the range 0..n-1 has its own equivalence
				2106	class. Classes can be joined by passing two class representatives to the
				2107	join(a, b) method. Two integers are in the same class when findLeader() returns
				2108	the same representative.
				2109
				2110	Once all equivalence classes are formed, the map can be compressed so each
				2111	integer 0..n-1 maps to an equivalence class number in the range 0..m-1, where m
				2112	is the total number of equivalence classes. The map must be uncompressed before
				2113	it can be edited again.
				2114
				2115	.. _dss_immutablemap:
				2116
				2117	llvm/ADT/ImmutableMap.h
				2118	^^^^^^^^^^^^^^^^^^^^^^^
				2119
				2120	ImmutableMap is an immutable (functional) map implementation based on an AVL
				2121	tree. Adding or removing elements is done through a Factory object and results
				2122	in the creation of a new ImmutableMap object. If an ImmutableMap already exists
				2123	with the given key set, then the existing one is returned; equality is compared
				2124	with a FoldingSetNodeID. The time and space complexity of add or remove
				2125	operations is logarithmic in the size of the original map.
				2126
				2127	.. _dss_othermap:
				2128
				2129	Other Map-Like Container Options
				2130	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				2131
				2132	The STL provides several other options, such as std::multimap and the various
				2133	"hash_map" like containers (whether from C++ TR1 or from the SGI library). We
				2134	never use hash_set and unordered_set because they are generally very expensive
				2135	(each insertion requires a malloc) and very non-portable.
				2136
				2137	std::multimap is useful if you want to map a key to multiple values, but has all
				2138	the drawbacks of std::map. A sorted vector or some other approach is almost
				2139	always better.
				2140
				2141	.. _ds_bit:
				2142
				2143	Bit storage containers (BitVector, SparseBitVector)
				2144	---------------------------------------------------
				2145
				2146	Unlike the other containers, there are only two bit storage containers, and
				2147	choosing when to use each is relatively straightforward.
				2148
				2149	One additional option is ``std::vector<bool>``: we discourage its use for two
				2150	reasons 1) the implementation in many common compilers (e.g. commonly
				2151	available versions of GCC) is extremely inefficient and 2) the C++ standards
				2152	committee is likely to deprecate this container and/or change it significantly
				2153	somehow. In any case, please don't use it.
				2154
				2155	.. _dss_bitvector:
				2156
				2157	BitVector
				2158	^^^^^^^^^
				2159
				2160	The BitVector container provides a dynamic size set of bits for manipulation.
				2161	It supports individual bit setting/testing, as well as set operations. The set
				2162	operations take time O(size of bitvector), but operations are performed one word
				2163	at a time, instead of one bit at a time. This makes the BitVector very fast for
				2164	set operations compared to other containers. Use the BitVector when you expect
				2165	the number of set bits to be high (i.e. a dense set).
				2166
				2167	.. _dss_smallbitvector:
				2168
				2169	SmallBitVector
				2170	^^^^^^^^^^^^^^
				2171
				2172	The SmallBitVector container provides the same interface as BitVector, but it is
				2173	optimized for the case where only a small number of bits, less than 25 or so,
				2174	are needed. It also transparently supports larger bit counts, but slightly less
				2175	efficiently than a plain BitVector, so SmallBitVector should only be used when
				2176	larger counts are rare.
				2177
				2178	At this time, SmallBitVector does not support set operations (and, or, xor), and
				2179	its operator[] does not provide an assignable lvalue.
				2180
				2181	.. _dss_sparsebitvector:
				2182
				2183	SparseBitVector
				2184	^^^^^^^^^^^^^^^
				2185
				2186	The SparseBitVector container is much like BitVector, with one major difference:
				2187	Only the bits that are set, are stored. This makes the SparseBitVector much
				2188	more space efficient than BitVector when the set is sparse, as well as making
				2189	set operations O(number of set bits) instead of O(size of universe). The
				2190	downside to the SparseBitVector is that setting and testing of random bits is
				2191	O(N), and on large SparseBitVectors, this can be slower than BitVector. In our
				2192	implementation, setting or testing bits in sorted order (either forwards or
				2193	reverse) is O(1) worst case. Testing and setting bits within 128 bits (depends
				2194	on size) of the current bit is also O(1). As a general statement,
				2195	testing/setting bits in a SparseBitVector is O(distance away from last set bit).
				2196
David Blaikie	063b272	2016-12-20 17:33:58 +0000	[diff] [blame]	2197	.. _debugging:
				2198
				2199	Debugging
				2200	=========
				2201
				2202	A handful of `GDB pretty printers
				2203	<https://sourceware.org/gdb/onlinedocs/gdb/Pretty-Printing.html>`__ are
				2204	provided for some of the core LLVM libraries. To use them, execute the
				2205	following (or add it to your ``~/.gdbinit``)::
				2206
				2207	source /path/to/llvm/src/utils/gdb-scripts/prettyprinters.py
				2208
				2209	It also might be handy to enable the `print pretty
David Blaikie	d21e08e	2016-12-20 17:43:48 +0000	[diff] [blame^]	2210	<http://ftp.gnu.org/old-gnu/Manuals/gdb/html_node/gdb_57.html>`__ option to
David Blaikie	063b272	2016-12-20 17:33:58 +0000	[diff] [blame]	2211	avoid data structures being printed as a big block of text.
				2212
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	2213	.. _common:
				2214
				2215	Helpful Hints for Common Operations
				2216	===================================
				2217
				2218	This section describes how to perform some very simple transformations of LLVM
				2219	code. This is meant to give examples of common idioms used, showing the
				2220	practical side of LLVM transformations.
				2221
				2222	Because this is a "how-to" section, you should also read about the main classes
				2223	that you will be working with. The :ref:`Core LLVM Class Hierarchy Reference
				2224	<coreclasses>` contains details and descriptions of the main classes that you
				2225	should know about.
				2226
				2227	.. _inspection:
				2228
				2229	Basic Inspection and Traversal Routines
				2230	---------------------------------------
				2231
				2232	The LLVM compiler infrastructure have many different data structures that may be
				2233	traversed. Following the example of the C++ standard template library, the
				2234	techniques used to traverse these various data structures are all basically the
				2235	same. For a enumerable sequence of values, the ``XXXbegin()`` function (or
				2236	method) returns an iterator to the start of the sequence, the ``XXXend()``
				2237	function returns an iterator pointing to one past the last valid element of the
				2238	sequence, and there is some ``XXXiterator`` data type that is common between the
				2239	two operations.
				2240
				2241	Because the pattern for iteration is common across many different aspects of the
				2242	program representation, the standard template library algorithms may be used on
				2243	them, and it is easier to remember how to iterate. First we show a few common
				2244	examples of the data structures that need to be traversed. Other data
				2245	structures are traversed in very similar ways.
				2246
				2247	.. _iterate_function:
				2248
				2249	Iterating over the ``BasicBlock`` in a ``Function``
				2250	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				2251
				2252	It's quite common to have a ``Function`` instance that you'd like to transform
				2253	in some way; in particular, you'd like to manipulate its ``BasicBlock``\ s. To
				2254	facilitate this, you'll need to iterate over all of the ``BasicBlock``\ s that
				2255	constitute the ``Function``. The following is an example that prints the name
				2256	of a ``BasicBlock`` and the number of ``Instruction``\ s it contains:
				2257
				2258	.. code-block:: c++
				2259
				2260	// func is a pointer to a Function instance
				2261	for (Function::iterator i = func->begin(), e = func->end(); i != e; ++i)
				2262	// Print out the name of the basic block if it has one, and then the
				2263	// number of instructions that it contains
				2264	errs() << "Basic block (name=" << i->getName() << ") has "
				2265	<< i->size() << " instructions.\n";
				2266
				2267	Note that i can be used as if it were a pointer for the purposes of invoking
				2268	member functions of the ``Instruction`` class. This is because the indirection
				2269	operator is overloaded for the iterator classes. In the above code, the
				2270	expression ``i->size()`` is exactly equivalent to ``(*i).size()`` just like
				2271	you'd expect.
				2272
				2273	.. _iterate_basicblock:
				2274
				2275	Iterating over the ``Instruction`` in a ``BasicBlock``
				2276	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				2277
				2278	Just like when dealing with ``BasicBlock``\ s in ``Function``\ s, it's easy to
				2279	iterate over the individual instructions that make up ``BasicBlock``\ s. Here's
				2280	a code snippet that prints out each instruction in a ``BasicBlock``:
				2281
				2282	.. code-block:: c++
				2283
				2284	// blk is a pointer to a BasicBlock instance
				2285	for (BasicBlock::iterator i = blk->begin(), e = blk->end(); i != e; ++i)
				2286	// The next statement works since operator<<(ostream&,...)
				2287	// is overloaded for Instruction&
				2288	errs() << *i << "\n";
				2289
				2290
				2291	However, this isn't really the best way to print out the contents of a
				2292	``BasicBlock``! Since the ostream operators are overloaded for virtually
				2293	anything you'll care about, you could have just invoked the print routine on the
				2294	basic block itself: ``errs() << *blk << "\n";``.
				2295
				2296	.. _iterate_insiter:
				2297
				2298	Iterating over the ``Instruction`` in a ``Function``
				2299	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				2300
				2301	If you're finding that you commonly iterate over a ``Function``'s
				2302	``BasicBlock``\ s and then that ``BasicBlock``'s ``Instruction``\ s,
				2303	``InstIterator`` should be used instead. You'll need to include
Yaron Keren	d9c0bed	2014-05-03 11:30:49 +0000	[diff] [blame]	2304	``llvm/IR/InstIterator.h`` (`doxygen
Yaron Keren	81bb415	2014-05-03 12:06:13 +0000	[diff] [blame]	2305	<http://llvm.org/doxygen/InstIterator_8h.html>`__) and then instantiate
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	2306	``InstIterator``\ s explicitly in your code. Here's a small example that shows
				2307	how to dump all instructions in a function to the standard error stream:
				2308
				2309	.. code-block:: c++
				2310
Yaron Keren	d9c0bed	2014-05-03 11:30:49 +0000	[diff] [blame]	2311	#include "llvm/IR/InstIterator.h"
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	2312
				2313	// F is a pointer to a Function instance
				2314	for (inst_iterator I = inst_begin(F), E = inst_end(F); I != E; ++I)
				2315	errs() << *I << "\n";
				2316
				2317	Easy, isn't it? You can also use ``InstIterator``\ s to fill a work list with
				2318	its initial contents. For example, if you wanted to initialize a work list to
				2319	contain all instructions in a ``Function`` F, all you would need to do is
				2320	something like:
				2321
				2322	.. code-block:: c++
				2323
				2324	std::set<Instruction*> worklist;
				2325	// or better yet, SmallPtrSet<Instruction*, 64> worklist;
				2326
				2327	for (inst_iterator I = inst_begin(F), E = inst_end(F); I != E; ++I)
				2328	worklist.insert(&*I);
				2329
				2330	The STL set ``worklist`` would now contain all instructions in the ``Function``
				2331	pointed to by F.
				2332
				2333	.. _iterate_convert:
				2334
				2335	Turning an iterator into a class pointer (and vice-versa)
				2336	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				2337
				2338	Sometimes, it'll be useful to grab a reference (or pointer) to a class instance
				2339	when all you've got at hand is an iterator. Well, extracting a reference or a
				2340	pointer from an iterator is very straight-forward. Assuming that ``i`` is a
				2341	``BasicBlock::iterator`` and ``j`` is a ``BasicBlock::const_iterator``:
				2342
				2343	.. code-block:: c++
				2344
				2345	Instruction& inst = *i; // Grab reference to instruction reference
				2346	Instruction* pinst = &*i; // Grab pointer to instruction reference
				2347	const Instruction& inst = *j;
				2348
				2349	However, the iterators you'll be working with in the LLVM framework are special:
				2350	they will automatically convert to a ptr-to-instance type whenever they need to.
Vedant Kumar	a34bdfa	2016-03-23 05:18:50 +0000	[diff] [blame]	2351	Instead of dereferencing the iterator and then taking the address of the result,
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	2352	you can simply assign the iterator to the proper pointer type and you get the
				2353	dereference and address-of operation as a result of the assignment (behind the
Charlie Turner	2ac115e	2015-04-16 17:01:23 +0000	[diff] [blame]	2354	scenes, this is a result of overloading casting mechanisms). Thus the second
				2355	line of the last example,
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	2356
				2357	.. code-block:: c++
				2358
				2359	Instruction pinst = &i;
				2360
				2361	is semantically equivalent to
				2362
				2363	.. code-block:: c++
				2364
				2365	Instruction *pinst = i;
				2366
				2367	It's also possible to turn a class pointer into the corresponding iterator, and
				2368	this is a constant time operation (very efficient). The following code snippet
				2369	illustrates use of the conversion constructors provided by LLVM iterators. By
				2370	using these, you can explicitly grab the iterator of something without actually
				2371	obtaining it via iteration over some structure:
				2372
				2373	.. code-block:: c++
				2374
				2375	void printNextInstruction(Instruction* inst) {
				2376	BasicBlock::iterator it(inst);
				2377	++it; // After this line, it refers to the instruction after *inst
				2378	if (it != inst->getParent()->end()) errs() << *it << "\n";
				2379	}
				2380
				2381	Unfortunately, these implicit conversions come at a cost; they prevent these
				2382	iterators from conforming to standard iterator conventions, and thus from being
				2383	usable with standard algorithms and containers. For example, they prevent the
				2384	following code, where ``B`` is a ``BasicBlock``, from compiling:
				2385
				2386	.. code-block:: c++
				2387
				2388	llvm::SmallVector<llvm::Instruction *, 16>(B->begin(), B->end());
				2389
				2390	Because of this, these implicit conversions may be removed some day, and
				2391	``operator*`` changed to return a pointer instead of a reference.
				2392
				2393	.. _iterate_complex:
				2394
				2395	Finding call sites: a slightly more complex example
				2396	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				2397
				2398	Say that you're writing a FunctionPass and would like to count all the locations
				2399	in the entire module (that is, across every ``Function``) where a certain
				2400	function (i.e., some ``Function *``) is already in scope. As you'll learn
				2401	later, you may want to use an ``InstVisitor`` to accomplish this in a much more
				2402	straight-forward manner, but this example will allow us to explore how you'd do
				2403	it if you didn't have ``InstVisitor`` around. In pseudo-code, this is what we
				2404	want to do:
				2405
				2406	.. code-block:: none
				2407
				2408	initialize callCounter to zero
				2409	for each Function f in the Module
				2410	for each BasicBlock b in f
				2411	for each Instruction i in b
				2412	if (i is a CallInst and calls the given function)
				2413	increment callCounter
				2414
				2415	And the actual code is (remember, because we're writing a ``FunctionPass``, our
				2416	``FunctionPass``-derived class simply has to override the ``runOnFunction``
				2417	method):
				2418
				2419	.. code-block:: c++
				2420
				2421	Function* targetFunc = ...;
				2422
				2423	class OurFunctionPass : public FunctionPass {
				2424	public:
				2425	OurFunctionPass(): callCounter(0) { }
				2426
				2427	virtual runOnFunction(Function& F) {
				2428	for (Function::iterator b = F.begin(), be = F.end(); b != be; ++b) {
				2429	for (BasicBlock::iterator i = b->begin(), ie = b->end(); i != ie; ++i) {
				2430	if (CallInst* callInst = dyn_cast<CallInst>(&*i)) {
				2431	// We know we've encountered a call instruction, so we
				2432	// need to determine if it's a call to the
				2433	// function pointed to by m_func or not.
				2434	if (callInst->getCalledFunction() == targetFunc)
				2435	++callCounter;
				2436	}
				2437	}
				2438	}
				2439	}
				2440
				2441	private:
				2442	unsigned callCounter;
				2443	};
				2444
				2445	.. _calls_and_invokes:
				2446
				2447	Treating calls and invokes the same way
				2448	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				2449
				2450	You may have noticed that the previous example was a bit oversimplified in that
				2451	it did not deal with call sites generated by 'invoke' instructions. In this,
				2452	and in other situations, you may find that you want to treat ``CallInst``\ s and
				2453	``InvokeInst``\ s the same way, even though their most-specific common base
				2454	class is ``Instruction``, which includes lots of less closely-related things.
				2455	For these cases, LLVM provides a handy wrapper class called ``CallSite``
				2456	(`doxygen <http://llvm.org/doxygen/classllvm_1_1CallSite.html>`__) It is
				2457	essentially a wrapper around an ``Instruction`` pointer, with some methods that
				2458	provide functionality common to ``CallInst``\ s and ``InvokeInst``\ s.
				2459
				2460	This class has "value semantics": it should be passed by value, not by reference
				2461	and it should not be dynamically allocated or deallocated using ``operator new``
				2462	or ``operator delete``. It is efficiently copyable, assignable and
				2463	constructable, with costs equivalents to that of a bare pointer. If you look at
				2464	its definition, it has only a single pointer member.
				2465
				2466	.. _iterate_chains:
				2467
				2468	Iterating over def-use & use-def chains
				2469	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				2470
				2471	Frequently, we might have an instance of the ``Value`` class (`doxygen
				2472	<http://llvm.org/doxygen/classllvm_1_1Value.html>`__) and we want to determine
				2473	which ``User`` s use the ``Value``. The list of all ``User``\ s of a particular
				2474	``Value`` is called a def-use chain. For example, let's say we have a
				2475	``Function*`` named ``F`` to a particular function ``foo``. Finding all of the
				2476	instructions that use ``foo`` is as simple as iterating over the def-use
				2477	chain of ``F``:
				2478
				2479	.. code-block:: c++
				2480
				2481	Function *F = ...;
				2482
Adam Nemet	3aecd18	2015-03-17 17:51:58 +0000	[diff] [blame]	2483	for (User *U : F->users()) {
Yaron Keren	adcf88e	2014-05-01 12:33:26 +0000	[diff] [blame]	2484	if (Instruction *Inst = dyn_cast<Instruction>(U)) {
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	2485	errs() << "F is used in instruction:\n";
				2486	errs() << *Inst << "\n";
				2487	}
				2488
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	2489	Alternatively, it's common to have an instance of the ``User`` Class (`doxygen
				2490	<http://llvm.org/doxygen/classllvm_1_1User.html>`__) and need to know what
				2491	``Value``\ s are used by it. The list of all ``Value``\ s used by a ``User`` is
				2492	known as a use-def chain. Instances of class ``Instruction`` are common
				2493	``User`` s, so we might want to iterate over all of the values that a particular
				2494	instruction uses (that is, the operands of the particular ``Instruction``):
				2495
				2496	.. code-block:: c++
				2497
				2498	Instruction *pi = ...;
				2499
Yaron Keren	7229bbf	2014-05-02 08:26:30 +0000	[diff] [blame]	2500	for (Use &U : pi->operands()) {
Yaron Keren	adcf88e	2014-05-01 12:33:26 +0000	[diff] [blame]	2501	Value *v = U.get();
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	2502	// ...
				2503	}
				2504
				2505	Declaring objects as ``const`` is an important tool of enforcing mutation free
				2506	algorithms (such as analyses, etc.). For this purpose above iterators come in
				2507	constant flavors as ``Value::const_use_iterator`` and
				2508	``Value::const_op_iterator``. They automatically arise when calling
				2509	``use/op_begin()`` on ``const Value``\ s or ``const User``\ s respectively.
				2510	Upon dereferencing, they return ``const Use*``\ s. Otherwise the above patterns
				2511	remain unchanged.
				2512
				2513	.. _iterate_preds:
				2514
				2515	Iterating over predecessors & successors of blocks
				2516	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				2517
				2518	Iterating over the predecessors and successors of a block is quite easy with the
Yaron Keren	28e28e8	2015-07-12 20:40:41 +0000	[diff] [blame]	2519	routines defined in ``"llvm/IR/CFG.h"``. Just use code like this to
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	2520	iterate over all predecessors of BB:
				2521
				2522	.. code-block:: c++
				2523
Andrey Bokhanko	7454145	2016-09-02 11:13:35 +0000	[diff] [blame]	2524	#include "llvm/IR/CFG.h"
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	2525	BasicBlock *BB = ...;
				2526
Duncan P. N. Exon Smith	6c99015	2014-07-21 17:06:51 +0000	[diff] [blame]	2527	for (pred_iterator PI = pred_begin(BB), E = pred_end(BB); PI != E; ++PI) {
				2528	BasicBlock Pred = PI;
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	2529	// ...
				2530	}
				2531
Duncan P. N. Exon Smith	6c99015	2014-07-21 17:06:51 +0000	[diff] [blame]	2532	Similarly, to iterate over successors use ``succ_iterator/succ_begin/succ_end``.
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	2533
				2534	.. _simplechanges:
				2535
				2536	Making simple changes
				2537	---------------------
				2538
				2539	There are some primitive transformation operations present in the LLVM
				2540	infrastructure that are worth knowing about. When performing transformations,
				2541	it's fairly common to manipulate the contents of basic blocks. This section
				2542	describes some of the common methods for doing so and gives example code.
				2543
				2544	.. _schanges_creating:
				2545
				2546	Creating and inserting new ``Instruction``\ s
				2547	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				2548
				2549	Instantiating Instructions
				2550
				2551	Creation of ``Instruction``\ s is straight-forward: simply call the constructor
				2552	for the kind of instruction to instantiate and provide the necessary parameters.
				2553	For example, an ``AllocaInst`` only requires a (const-ptr-to) ``Type``. Thus:
				2554
				2555	.. code-block:: c++
				2556
				2557	AllocaInst* ai = new AllocaInst(Type::Int32Ty);
				2558
				2559	will create an ``AllocaInst`` instance that represents the allocation of one
				2560	integer in the current stack frame, at run time. Each ``Instruction`` subclass
				2561	is likely to have varying default parameters which change the semantics of the
				2562	instruction, so refer to the `doxygen documentation for the subclass of
				2563	Instruction <http://llvm.org/doxygen/classllvm_1_1Instruction.html>`_ that
				2564	you're interested in instantiating.
				2565
				2566	Naming values
				2567
				2568	It is very useful to name the values of instructions when you're able to, as
				2569	this facilitates the debugging of your transformations. If you end up looking
				2570	at generated LLVM machine code, you definitely want to have logical names
				2571	associated with the results of instructions! By supplying a value for the
				2572	``Name`` (default) parameter of the ``Instruction`` constructor, you associate a
				2573	logical name with the result of the instruction's execution at run time. For
				2574	example, say that I'm writing a transformation that dynamically allocates space
				2575	for an integer on the stack, and that integer is going to be used as some kind
				2576	of index by some other code. To accomplish this, I place an ``AllocaInst`` at
				2577	the first point in the first ``BasicBlock`` of some ``Function``, and I'm
				2578	intending to use it within the same ``Function``. I might do:
				2579
				2580	.. code-block:: c++
				2581
				2582	AllocaInst* pa = new AllocaInst(Type::Int32Ty, 0, "indexLoc");
				2583
				2584	where ``indexLoc`` is now the logical name of the instruction's execution value,
				2585	which is a pointer to an integer on the run time stack.
				2586
				2587	Inserting instructions
				2588
Dan Liew	c6ab58f	2014-06-06 17:25:47 +0000	[diff] [blame]	2589	There are essentially three ways to insert an ``Instruction`` into an existing
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	2590	sequence of instructions that form a ``BasicBlock``:
				2591
				2592	* Insertion into an explicit instruction list
				2593
				2594	Given a ``BasicBlock* pb``, an ``Instruction* pi`` within that ``BasicBlock``,
				2595	and a newly-created instruction we wish to insert before ``*pi``, we do the
				2596	following:
				2597
				2598	.. code-block:: c++
				2599
				2600	BasicBlock *pb = ...;
				2601	Instruction *pi = ...;
				2602	Instruction *newInst = new Instruction(...);
				2603
				2604	pb->getInstList().insert(pi, newInst); // Inserts newInst before pi in pb
				2605
				2606	Appending to the end of a ``BasicBlock`` is so common that the ``Instruction``
				2607	class and ``Instruction``-derived classes provide constructors which take a
				2608	pointer to a ``BasicBlock`` to be appended to. For example code that looked
				2609	like:
				2610
				2611	.. code-block:: c++
				2612
				2613	BasicBlock *pb = ...;
				2614	Instruction *newInst = new Instruction(...);
				2615
				2616	pb->getInstList().push_back(newInst); // Appends newInst to pb
				2617
				2618	becomes:
				2619
				2620	.. code-block:: c++
				2621
				2622	BasicBlock *pb = ...;
				2623	Instruction *newInst = new Instruction(..., pb);
				2624
				2625	which is much cleaner, especially if you are creating long instruction
				2626	streams.
				2627
				2628	* Insertion into an implicit instruction list
				2629
				2630	``Instruction`` instances that are already in ``BasicBlock``\ s are implicitly
				2631	associated with an existing instruction list: the instruction list of the
				2632	enclosing basic block. Thus, we could have accomplished the same thing as the
				2633	above code without being given a ``BasicBlock`` by doing:
				2634
				2635	.. code-block:: c++
				2636
				2637	Instruction *pi = ...;
				2638	Instruction *newInst = new Instruction(...);
				2639
				2640	pi->getParent()->getInstList().insert(pi, newInst);
				2641
				2642	In fact, this sequence of steps occurs so frequently that the ``Instruction``
				2643	class and ``Instruction``-derived classes provide constructors which take (as
				2644	a default parameter) a pointer to an ``Instruction`` which the newly-created
				2645	``Instruction`` should precede. That is, ``Instruction`` constructors are
				2646	capable of inserting the newly-created instance into the ``BasicBlock`` of a
				2647	provided instruction, immediately before that instruction. Using an
				2648	``Instruction`` constructor with a ``insertBefore`` (default) parameter, the
				2649	above code becomes:
				2650
				2651	.. code-block:: c++
				2652
				2653	Instruction* pi = ...;
				2654	Instruction* newInst = new Instruction(..., pi);
				2655
				2656	which is much cleaner, especially if you're creating a lot of instructions and
				2657	adding them to ``BasicBlock``\ s.
				2658
Dan Liew	c6ab58f	2014-06-06 17:25:47 +0000	[diff] [blame]	2659	* Insertion using an instance of ``IRBuilder``
				2660
Dan Liew	599cec6	2014-06-06 18:44:21 +0000	[diff] [blame]	2661	Inserting several ``Instruction``\ s can be quite laborious using the previous
Dan Liew	c6ab58f	2014-06-06 17:25:47 +0000	[diff] [blame]	2662	methods. The ``IRBuilder`` is a convenience class that can be used to add
				2663	several instructions to the end of a ``BasicBlock`` or before a particular
				2664	``Instruction``. It also supports constant folding and renaming named
				2665	registers (see ``IRBuilder``'s template arguments).
				2666
				2667	The example below demonstrates a very simple use of the ``IRBuilder`` where
				2668	three instructions are inserted before the instruction ``pi``. The first two
				2669	instructions are Call instructions and third instruction multiplies the return
				2670	value of the two calls.
				2671
				2672	.. code-block:: c++
				2673
				2674	Instruction *pi = ...;
				2675	IRBuilder<> Builder(pi);
				2676	CallInst* callOne = Builder.CreateCall(...);
				2677	CallInst* callTwo = Builder.CreateCall(...);
				2678	Value* result = Builder.CreateMul(callOne, callTwo);
				2679
				2680	The example below is similar to the above example except that the created
				2681	``IRBuilder`` inserts instructions at the end of the ``BasicBlock`` ``pb``.
				2682
				2683	.. code-block:: c++
				2684
				2685	BasicBlock *pb = ...;
				2686	IRBuilder<> Builder(pb);
				2687	CallInst* callOne = Builder.CreateCall(...);
				2688	CallInst* callTwo = Builder.CreateCall(...);
				2689	Value* result = Builder.CreateMul(callOne, callTwo);
				2690
Etienne Bergeron	d8b9735	2016-07-13 06:10:37 +0000	[diff] [blame]	2691	See :doc:`tutorial/LangImpl03` for a practical use of the ``IRBuilder``.
Dan Liew	c6ab58f	2014-06-06 17:25:47 +0000	[diff] [blame]	2692
				2693
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	2694	.. _schanges_deleting:
				2695
				2696	Deleting Instructions
				2697	^^^^^^^^^^^^^^^^^^^^^
				2698
				2699	Deleting an instruction from an existing sequence of instructions that form a
				2700	BasicBlock_ is very straight-forward: just call the instruction's
				2701	``eraseFromParent()`` method. For example:
				2702
				2703	.. code-block:: c++
				2704
				2705	Instruction *I = .. ;
				2706	I->eraseFromParent();
				2707
				2708	This unlinks the instruction from its containing basic block and deletes it. If
				2709	you'd just like to unlink the instruction from its containing basic block but
				2710	not delete it, you can use the ``removeFromParent()`` method.
				2711
				2712	.. _schanges_replacing:
				2713
				2714	Replacing an Instruction with another Value
				2715	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				2716
				2717	Replacing individual instructions
				2718	"""""""""""""""""""""""""""""""""
				2719
				2720	Including "`llvm/Transforms/Utils/BasicBlockUtils.h
				2721	<http://llvm.org/doxygen/BasicBlockUtils_8h-source.html>`_" permits use of two
				2722	very useful replace functions: ``ReplaceInstWithValue`` and
				2723	``ReplaceInstWithInst``.
				2724
				2725	.. _schanges_deleting_sub:
				2726
				2727	Deleting Instructions
				2728	"""""""""""""""""""""
				2729
				2730	* ``ReplaceInstWithValue``
				2731
				2732	This function replaces all uses of a given instruction with a value, and then
				2733	removes the original instruction. The following example illustrates the
				2734	replacement of the result of a particular ``AllocaInst`` that allocates memory
				2735	for a single integer with a null pointer to an integer.
				2736
				2737	.. code-block:: c++
				2738
				2739	AllocaInst* instToReplace = ...;
				2740	BasicBlock::iterator ii(instToReplace);
				2741
				2742	ReplaceInstWithValue(instToReplace->getParent()->getInstList(), ii,
				2743	Constant::getNullValue(PointerType::getUnqual(Type::Int32Ty)));
				2744
				2745	* ``ReplaceInstWithInst``
				2746
				2747	This function replaces a particular instruction with another instruction,
				2748	inserting the new instruction into the basic block at the location where the
				2749	old instruction was, and replacing any uses of the old instruction with the
				2750	new instruction. The following example illustrates the replacement of one
				2751	``AllocaInst`` with another.
				2752
				2753	.. code-block:: c++
				2754
				2755	AllocaInst* instToReplace = ...;
				2756	BasicBlock::iterator ii(instToReplace);
				2757
				2758	ReplaceInstWithInst(instToReplace->getParent()->getInstList(), ii,
				2759	new AllocaInst(Type::Int32Ty, 0, "ptrToReplacedInt"));
				2760
				2761
				2762	Replacing multiple uses of Users and Values
				2763	"""""""""""""""""""""""""""""""""""""""""""
				2764
				2765	You can use ``Value::replaceAllUsesWith`` and ``User::replaceUsesOfWith`` to
				2766	change more than one use at a time. See the doxygen documentation for the
				2767	`Value Class <http://llvm.org/doxygen/classllvm_1_1Value.html>`_ and `User Class
				2768	<http://llvm.org/doxygen/classllvm_1_1User.html>`_, respectively, for more
				2769	information.
				2770
				2771	.. _schanges_deletingGV:
				2772
				2773	Deleting GlobalVariables
				2774	^^^^^^^^^^^^^^^^^^^^^^^^
				2775
				2776	Deleting a global variable from a module is just as easy as deleting an
				2777	Instruction. First, you must have a pointer to the global variable that you
				2778	wish to delete. You use this pointer to erase it from its parent, the module.
				2779	For example:
				2780
				2781	.. code-block:: c++
				2782
				2783	GlobalVariable *GV = .. ;
				2784
				2785	GV->eraseFromParent();
				2786
				2787
				2788	.. _create_types:
				2789
				2790	How to Create Types
				2791	-------------------
				2792
				2793	In generating IR, you may need some complex types. If you know these types
				2794	statically, you can use ``TypeBuilder<...>::get()``, defined in
				2795	``llvm/Support/TypeBuilder.h``, to retrieve them. ``TypeBuilder`` has two forms
				2796	depending on whether you're building types for cross-compilation or native
				2797	library use. ``TypeBuilder<T, true>`` requires that ``T`` be independent of the
				2798	host environment, meaning that it's built out of types from the ``llvm::types``
				2799	(`doxygen <http://llvm.org/doxygen/namespacellvm_1_1types.html>`__) namespace
				2800	and pointers, functions, arrays, etc. built of those. ``TypeBuilder<T, false>``
				2801	additionally allows native C types whose size may depend on the host compiler.
				2802	For example,
				2803
				2804	.. code-block:: c++
				2805
				2806	FunctionType ft = TypeBuilder<types::i<8>(types::i<32>), true>::get();
				2807
				2808	is easier to read and write than the equivalent
				2809
				2810	.. code-block:: c++
				2811
				2812	std::vector<const Type*> params;
				2813	params.push_back(PointerType::getUnqual(Type::Int32Ty));
				2814	FunctionType *ft = FunctionType::get(Type::Int8Ty, params, false);
				2815
				2816	See the `class comment
				2817	<http://llvm.org/doxygen/TypeBuilder_8h-source.html#l00001>`_ for more details.
				2818
				2819	.. _threading:
				2820
				2821	Threads and LLVM
				2822	================
				2823
				2824	This section describes the interaction of the LLVM APIs with multithreading,
				2825	both on the part of client applications, and in the JIT, in the hosted
				2826	application.
				2827
				2828	Note that LLVM's support for multithreading is still relatively young. Up
				2829	through version 2.5, the execution of threaded hosted applications was
				2830	supported, but not threaded client access to the APIs. While this use case is
				2831	now supported, clients must adhere to the guidelines specified below to ensure
				2832	proper operation in multithreaded mode.
				2833
				2834	Note that, on Unix-like platforms, LLVM requires the presence of GCC's atomic
				2835	intrinsics in order to support threaded operation. If you need a
				2836	multhreading-capable LLVM on a platform without a suitably modern system
				2837	compiler, consider compiling LLVM and LLVM-GCC in single-threaded mode, and
				2838	using the resultant compiler to build a copy of LLVM with multithreading
				2839	support.
				2840
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	2841	.. _shutdown:
				2842
				2843	Ending Execution with ``llvm_shutdown()``
				2844	-----------------------------------------
				2845
				2846	When you are done using the LLVM APIs, you should call ``llvm_shutdown()`` to
Chandler Carruth	39cd216	2014-06-27 15:13:01 +0000	[diff] [blame]	2847	deallocate memory used for internal structures.
Zachary Turner	ccbf3d0	2014-06-16 22:49:41 +0000	[diff] [blame]	2848
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	2849	.. _managedstatic:
				2850
				2851	Lazy Initialization with ``ManagedStatic``
				2852	------------------------------------------
				2853
				2854	``ManagedStatic`` is a utility class in LLVM used to implement static
Chandler Carruth	39cd216	2014-06-27 15:13:01 +0000	[diff] [blame]	2855	initialization of static resources, such as the global type tables. In a
				2856	single-threaded environment, it implements a simple lazy initialization scheme.
				2857	When LLVM is compiled with support for multi-threading, however, it uses
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	2858	double-checked locking to implement thread-safe lazy initialization.
				2859
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	2860	.. _llvmcontext:
				2861
				2862	Achieving Isolation with ``LLVMContext``
				2863	----------------------------------------
				2864
				2865	``LLVMContext`` is an opaque class in the LLVM API which clients can use to
				2866	operate multiple, isolated instances of LLVM concurrently within the same
				2867	address space. For instance, in a hypothetical compile-server, the compilation
				2868	of an individual translation unit is conceptually independent from all the
				2869	others, and it would be desirable to be able to compile incoming translation
				2870	units concurrently on independent server threads. Fortunately, ``LLVMContext``
				2871	exists to enable just this kind of scenario!
				2872
				2873	Conceptually, ``LLVMContext`` provides isolation. Every LLVM entity
				2874	(``Module``\ s, ``Value``\ s, ``Type``\ s, ``Constant``\ s, etc.) in LLVM's
				2875	in-memory IR belongs to an ``LLVMContext``. Entities in different contexts
				2876	cannot interact with each other: ``Module``\ s in different contexts cannot be
				2877	linked together, ``Function``\ s cannot be added to ``Module``\ s in different
				2878	contexts, etc. What this means is that is is safe to compile on multiple
				2879	threads simultaneously, as long as no two threads operate on entities within the
				2880	same context.
				2881
				2882	In practice, very few places in the API require the explicit specification of a
				2883	``LLVMContext``, other than the ``Type`` creation/lookup APIs. Because every
				2884	``Type`` carries a reference to its owning context, most other entities can
				2885	determine what context they belong to by looking at their own ``Type``. If you
				2886	are adding new entities to LLVM IR, please try to maintain this interface
				2887	design.
				2888
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	2889	.. _jitthreading:
				2890
				2891	Threads and the JIT
				2892	-------------------
				2893
				2894	LLVM's "eager" JIT compiler is safe to use in threaded programs. Multiple
				2895	threads can call ``ExecutionEngine::getPointerToFunction()`` or
				2896	``ExecutionEngine::runFunction()`` concurrently, and multiple threads can run
				2897	code output by the JIT concurrently. The user must still ensure that only one
				2898	thread accesses IR in a given ``LLVMContext`` while another thread might be
				2899	modifying it. One way to do that is to always hold the JIT lock while accessing
				2900	IR outside the JIT (the JIT modifies the IR by adding ``CallbackVH``\ s).
				2901	Another way is to only call ``getPointerToFunction()`` from the
				2902	``LLVMContext``'s thread.
				2903
				2904	When the JIT is configured to compile lazily (using
				2905	``ExecutionEngine::DisableLazyCompilation(false)``), there is currently a `race
				2906	condition <http://llvm.org/bugs/show_bug.cgi?id=5184>`_ in updating call sites
				2907	after a function is lazily-jitted. It's still possible to use the lazy JIT in a
				2908	threaded program if you ensure that only one thread at a time can call any
				2909	particular lazy stub and that the JIT lock guards any IR access, but we suggest
				2910	using only the eager JIT in threaded programs.
				2911
				2912	.. _advanced:
				2913
				2914	Advanced Topics
				2915	===============
				2916
				2917	This section describes some of the advanced or obscure API's that most clients
				2918	do not need to be aware of. These API's tend manage the inner workings of the
				2919	LLVM system, and only need to be accessed in unusual circumstances.
				2920
				2921	.. _SymbolTable:
				2922
				2923	The ``ValueSymbolTable`` class
				2924	------------------------------
				2925
				2926	The ``ValueSymbolTable`` (`doxygen
				2927	<http://llvm.org/doxygen/classllvm_1_1ValueSymbolTable.html>`__) class provides
				2928	a symbol table that the :ref:`Function <c_Function>` and Module_ classes use for
				2929	naming value definitions. The symbol table can provide a name for any Value_.
				2930
				2931	Note that the ``SymbolTable`` class should not be directly accessed by most
				2932	clients. It should only be used when iteration over the symbol table names
				2933	themselves are required, which is very special purpose. Note that not all LLVM
				2934	Value_\ s have names, and those without names (i.e. they have an empty name) do
				2935	not exist in the symbol table.
				2936
				2937	Symbol tables support iteration over the values in the symbol table with
				2938	``begin/end/iterator`` and supports querying to see if a specific name is in the
				2939	symbol table (with ``lookup``). The ``ValueSymbolTable`` class exposes no
				2940	public mutator methods, instead, simply call ``setName`` on a value, which will
				2941	autoinsert it into the appropriate symbol table.
				2942
				2943	.. _UserLayout:
				2944
				2945	The ``User`` and owned ``Use`` classes' memory layout
				2946	-----------------------------------------------------
				2947
				2948	The ``User`` (`doxygen <http://llvm.org/doxygen/classllvm_1_1User.html>`__)
				2949	class provides a basis for expressing the ownership of ``User`` towards other
				2950	`Value instance <http://llvm.org/doxygen/classllvm_1_1Value.html>`_\ s. The
				2951	``Use`` (`doxygen <http://llvm.org/doxygen/classllvm_1_1Use.html>`__) helper
				2952	class is employed to do the bookkeeping and to facilitate O(1) addition and
				2953	removal.
				2954
				2955	.. _Use2User:
				2956
				2957	Interaction and relationship between ``User`` and ``Use`` objects
				2958	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				2959
				2960	A subclass of ``User`` can choose between incorporating its ``Use`` objects or
				2961	refer to them out-of-line by means of a pointer. A mixed variant (some ``Use``
				2962	s inline others hung off) is impractical and breaks the invariant that the
				2963	``Use`` objects belonging to the same ``User`` form a contiguous array.
				2964
				2965	We have 2 different layouts in the ``User`` (sub)classes:
				2966
				2967	* Layout a)
				2968
				2969	The ``Use`` object(s) are inside (resp. at fixed offset) of the ``User``
				2970	object and there are a fixed number of them.
				2971
				2972	* Layout b)
				2973
				2974	The ``Use`` object(s) are referenced by a pointer to an array from the
				2975	``User`` object and there may be a variable number of them.
				2976
				2977	As of v2.4 each layout still possesses a direct pointer to the start of the
				2978	array of ``Use``\ s. Though not mandatory for layout a), we stick to this
				2979	redundancy for the sake of simplicity. The ``User`` object also stores the
				2980	number of ``Use`` objects it has. (Theoretically this information can also be
				2981	calculated given the scheme presented below.)
				2982
				2983	Special forms of allocation operators (``operator new``) enforce the following
				2984	memory layouts:
				2985
				2986	* Layout a) is modelled by prepending the ``User`` object by the ``Use[]``
				2987	array.
				2988
				2989	.. code-block:: none
				2990
				2991	...---.---.---.---.-------...
				2992	\| P \| P \| P \| P \| User
				2993	'''---'---'---'---'-------'''
				2994
				2995	* Layout b) is modelled by pointing at the ``Use[]`` array.
				2996
				2997	.. code-block:: none
				2998
				2999	.-------...
				3000	\| User
				3001	'-------'''
				3002	\|
				3003	v
				3004	.---.---.---.---...
				3005	\| P \| P \| P \| P \|
				3006	'---'---'---'---'''
				3007
				3008	(In the above figures '``P``' stands for the ``Use*`` that is stored in
				3009	each* ``Use`` object in the member ``Use::Prev`` )
				3010
				3011	.. _Waymarking:
				3012
				3013	The waymarking algorithm
				3014	^^^^^^^^^^^^^^^^^^^^^^^^
				3015
				3016	Since the ``Use`` objects are deprived of the direct (back)pointer to their
				3017	``User`` objects, there must be a fast and exact method to recover it. This is
				3018	accomplished by the following scheme:
				3019
				3020	A bit-encoding in the 2 LSBits (least significant bits) of the ``Use::Prev``
				3021	allows to find the start of the ``User`` object:
				3022
Dmitri Gribenko	e813112	2013-01-19 20:34:20 +0000	[diff] [blame]	3023	* ``00`` --- binary digit 0
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	3024
Dmitri Gribenko	e813112	2013-01-19 20:34:20 +0000	[diff] [blame]	3025	* ``01`` --- binary digit 1
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	3026
Dmitri Gribenko	e813112	2013-01-19 20:34:20 +0000	[diff] [blame]	3027	* ``10`` --- stop and calculate (``s``)
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	3028
Dmitri Gribenko	e813112	2013-01-19 20:34:20 +0000	[diff] [blame]	3029	* ``11`` --- full stop (``S``)
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	3030
				3031	Given a ``Use*``, all we have to do is to walk till we get a stop and we either
				3032	have a ``User`` immediately behind or we have to walk to the next stop picking
				3033	up digits and calculating the offset:
				3034
				3035	.. code-block:: none
				3036
				3037	.---.---.---.---.---.---.---.---.---.---.---.---.---.---.---.---.----------------
				3038	\| 1 \| s \| 1 \| 0 \| 1 \| 0 \| s \| 1 \| 1 \| 0 \| s \| 1 \| 1 \| s \| 1 \| S \| User (or User*)
				3039	'---'---'---'---'---'---'---'---'---'---'---'---'---'---'---'---'----------------
				3040	\|+15 \|+10 \|+6 \|+3 \|+1
				3041	\| \| \| \| \| __>
				3042	\| \| \| \| __________>
				3043	\| \| \| ______________________>
				3044	\| \| ______________________________________>
				3045	\| __________________________________________________________>
				3046
				3047	Only the significant number of bits need to be stored between the stops, so that
				3048	the worst case is 20 memory accesses when there are 1000 ``Use`` objects
				3049	associated with a ``User``.
				3050
				3051	.. _ReferenceImpl:
				3052
				3053	Reference implementation
				3054	^^^^^^^^^^^^^^^^^^^^^^^^
				3055
				3056	The following literate Haskell fragment demonstrates the concept:
				3057
				3058	.. code-block:: haskell
				3059
				3060	> import Test.QuickCheck
				3061	>
				3062	> digits :: Int -> [Char] -> [Char]
				3063	> digits 0 acc = '0' : acc
				3064	> digits 1 acc = '1' : acc
				3065	> digits n acc = digits (n `div` 2) $ digits (n `mod` 2) acc
				3066	>
				3067	> dist :: Int -> [Char] -> [Char]
				3068	> dist 0 [] = ['S']
				3069	> dist 0 acc = acc
				3070	> dist 1 acc = let r = dist 0 acc in 's' : digits (length r) r
				3071	> dist n acc = dist (n - 1) $ dist 1 acc
				3072	>
				3073	> takeLast n ss = reverse $ take n $ reverse ss
				3074	>
				3075	> test = takeLast 40 $ dist 20 []
				3076	>
				3077
				3078	Printing <test> gives: ``"1s100000s11010s10100s1111s1010s110s11s1S"``
				3079
				3080	The reverse algorithm computes the length of the string just by examining a
				3081	certain prefix:
				3082
				3083	.. code-block:: haskell
				3084
				3085	> pref :: [Char] -> Int
				3086	> pref "S" = 1
				3087	> pref ('s':'1':rest) = decode 2 1 rest
				3088	> pref (_:rest) = 1 + pref rest
				3089	>
				3090	> decode walk acc ('0':rest) = decode (walk + 1) (acc * 2) rest
				3091	> decode walk acc ('1':rest) = decode (walk + 1) (acc * 2 + 1) rest
				3092	> decode walk acc _ = walk + acc
				3093	>
				3094
				3095	Now, as expected, printing <pref test> gives ``40``.
				3096
				3097	We can quickCheck this with following property:
				3098
				3099	.. code-block:: haskell
				3100
				3101	> testcase = dist 2000 []
				3102	> testcaseLength = length testcase
				3103	>
				3104	> identityProp n = n > 0 && n <= testcaseLength ==> length arr == pref arr
				3105	> where arr = takeLast n testcase
				3106	>
				3107
				3108	As expected <quickCheck identityProp> gives:
				3109
				3110	::
				3111
				3112	*Main> quickCheck identityProp
				3113	OK, passed 100 tests.
				3114
				3115	Let's be a bit more exhaustive:
				3116
				3117	.. code-block:: haskell
				3118
				3119	>
				3120	> deepCheck p = check (defaultConfig { configMaxTest = 500 }) p
				3121	>
				3122
				3123	And here is the result of <deepCheck identityProp>:
				3124
				3125	::
				3126
				3127	*Main> deepCheck identityProp
				3128	OK, passed 500 tests.
				3129
				3130	.. _Tagging:
				3131
				3132	Tagging considerations
				3133	^^^^^^^^^^^^^^^^^^^^^^
				3134
				3135	To maintain the invariant that the 2 LSBits of each ``Use**`` in ``Use`` never
				3136	change after being set up, setters of ``Use::Prev`` must re-tag the new
				3137	``Use**`` on every modification. Accordingly getters must strip the tag bits.
				3138
				3139	For layout b) instead of the ``User`` we find a pointer (``User*`` with LSBit
				3140	set). Following this pointer brings us to the ``User``. A portable trick
				3141	ensures that the first bytes of ``User`` (if interpreted as a pointer) never has
				3142	the LSBit set. (Portability is relying on the fact that all known compilers
				3143	place the ``vptr`` in the first word of the instances.)
				3144
Chandler Carruth	064dc33	2015-01-28 03:04:54 +0000	[diff] [blame]	3145	.. _polymorphism:
				3146
				3147	Designing Type Hiercharies and Polymorphic Interfaces
				3148	-----------------------------------------------------
				3149
				3150	There are two different design patterns that tend to result in the use of
				3151	virtual dispatch for methods in a type hierarchy in C++ programs. The first is
				3152	a genuine type hierarchy where different types in the hierarchy model
				3153	a specific subset of the functionality and semantics, and these types nest
				3154	strictly within each other. Good examples of this can be seen in the ``Value``
				3155	or ``Type`` type hierarchies.
				3156
				3157	A second is the desire to dispatch dynamically across a collection of
				3158	polymorphic interface implementations. This latter use case can be modeled with
				3159	virtual dispatch and inheritance by defining an abstract interface base class
				3160	which all implementations derive from and override. However, this
				3161	implementation strategy forces an "is-a" relationship to exist that is not
				3162	actually meaningful. There is often not some nested hierarchy of useful
				3163	generalizations which code might interact with and move up and down. Instead,
				3164	there is a singular interface which is dispatched across a range of
				3165	implementations.
				3166
				3167	The preferred implementation strategy for the second use case is that of
				3168	generic programming (sometimes called "compile-time duck typing" or "static
				3169	polymorphism"). For example, a template over some type parameter ``T`` can be
				3170	instantiated across any particular implementation that conforms to the
				3171	interface or concept. A good example here is the highly generic properties of
				3172	any type which models a node in a directed graph. LLVM models these primarily
				3173	through templates and generic programming. Such templates include the
				3174	``LoopInfoBase`` and ``DominatorTreeBase``. When this type of polymorphism
				3175	truly needs dynamic dispatch you can generalize it using a technique
				3176	called concept-based polymorphism. This pattern emulates the interfaces and
				3177	behaviors of templates using a very limited form of virtual dispatch for type
				3178	erasure inside its implementation. You can find examples of this technique in
				3179	the ``PassManager.h`` system, and there is a more detailed introduction to it
				3180	by Sean Parent in several of his talks and papers:
				3181
				3182	#. `Inheritance Is The Base Class of Evil
				3183	<http://channel9.msdn.com/Events/GoingNative/2013/Inheritance-Is-The-Base-Class-of-Evil>`_
				3184	- The GoingNative 2013 talk describing this technique, and probably the best
				3185	place to start.
				3186	#. `Value Semantics and Concepts-based Polymorphism
				3187	<http://www.youtube.com/watch?v=_BpMYeUFXv8>`_ - The C++Now! 2012 talk
				3188	describing this technique in more detail.
				3189	#. `Sean Parent's Papers and Presentations
				3190	<http://github.com/sean-parent/sean-parent.github.com/wiki/Papers-and-Presentations>`_
				3191	- A Github project full of links to slides, video, and sometimes code.
				3192
				3193	When deciding between creating a type hierarchy (with either tagged or virtual
				3194	dispatch) and using templates or concepts-based polymorphism, consider whether
				3195	there is some refinement of an abstract base class which is a semantically
				3196	meaningful type on an interface boundary. If anything more refined than the
				3197	root abstract interface is meaningless to talk about as a partial extension of
				3198	the semantic model, then your use case likely fits better with polymorphism and
				3199	you should avoid using virtual dispatch. However, there may be some exigent
				3200	circumstances that require one technique or the other to be used.
				3201
				3202	If you do need to introduce a type hierarchy, we prefer to use explicitly
				3203	closed type hierarchies with manual tagged dispatch and/or RTTI rather than the
				3204	open inheritance model and virtual dispatch that is more common in C++ code.
				3205	This is because LLVM rarely encourages library consumers to extend its core
				3206	types, and leverages the closed and tag-dispatched nature of its hierarchies to
				3207	generate significantly more efficient code. We have also found that a large
				3208	amount of our usage of type hierarchies fits better with tag-based pattern
				3209	matching rather than dynamic dispatch across a common interface. Within LLVM we
				3210	have built custom helpers to facilitate this design. See this document's
Sean Silva	52c7dcd	2015-01-28 10:36:41 +0000	[diff] [blame]	3211	section on :ref:`isa and dyn_cast <isa>` and our :doc:`detailed document
				3212	<HowToSetUpLLVMStyleRTTI>` which describes how you can implement this
				3213	pattern for use with the LLVM helpers.
Chandler Carruth	064dc33	2015-01-28 03:04:54 +0000	[diff] [blame]	3214
Sanjoy Das	8ce6499	2015-03-26 19:25:01 +0000	[diff] [blame]	3215	.. _abi_breaking_checks:
				3216
				3217	ABI Breaking Checks
				3218	-------------------
				3219
				3220	Checks and asserts that alter the LLVM C++ ABI are predicated on the
				3221	preprocessor symbol `LLVM_ENABLE_ABI_BREAKING_CHECKS` -- LLVM
				3222	libraries built with `LLVM_ENABLE_ABI_BREAKING_CHECKS` are not ABI
				3223	compatible LLVM libraries built without it defined. By default,
				3224	turning on assertions also turns on `LLVM_ENABLE_ABI_BREAKING_CHECKS`
				3225	so a default +Asserts build is not ABI compatible with a
				3226	default -Asserts build. Clients that want ABI compatibility
				3227	between +Asserts and -Asserts builds should use the CMake or autoconf
				3228	build systems to set `LLVM_ENABLE_ABI_BREAKING_CHECKS` independently
				3229	of `LLVM_ENABLE_ASSERTIONS`.
				3230
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	3231	.. _coreclasses:
				3232
				3233	The Core LLVM Class Hierarchy Reference
				3234	=======================================
				3235
Benjamin Kramer	9f566a5	2013-07-08 19:59:35 +0000	[diff] [blame]	3236	``#include "llvm/IR/Type.h"``
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	3237
				3238	header source: `Type.h <http://llvm.org/doxygen/Type_8h-source.html>`_
				3239
				3240	doxygen info: `Type Clases <http://llvm.org/doxygen/classllvm_1_1Type.html>`_
				3241
				3242	The Core LLVM classes are the primary means of representing the program being
				3243	inspected or transformed. The core LLVM classes are defined in header files in
Charlie Turner	2ac115e	2015-04-16 17:01:23 +0000	[diff] [blame]	3244	the ``include/llvm/IR`` directory, and implemented in the ``lib/IR``
				3245	directory. It's worth noting that, for historical reasons, this library is
				3246	called ``libLLVMCore.so``, not ``libLLVMIR.so`` as you might expect.
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	3247
				3248	.. _Type:
				3249
				3250	The Type class and Derived Types
				3251	--------------------------------
				3252
				3253	``Type`` is a superclass of all type classes. Every ``Value`` has a ``Type``.
				3254	``Type`` cannot be instantiated directly but only through its subclasses.
				3255	Certain primitive types (``VoidType``, ``LabelType``, ``FloatType`` and
				3256	``DoubleType``) have hidden subclasses. They are hidden because they offer no
				3257	useful functionality beyond what the ``Type`` class offers except to distinguish
				3258	themselves from other subclasses of ``Type``.
				3259
				3260	All other types are subclasses of ``DerivedType``. Types can be named, but this
				3261	is not a requirement. There exists exactly one instance of a given shape at any
				3262	one time. This allows type equality to be performed with address equality of
				3263	the Type Instance. That is, given two ``Type*`` values, the types are identical
				3264	if the pointers are identical.
				3265
				3266	.. _m_Type:
				3267
				3268	Important Public Methods
				3269	^^^^^^^^^^^^^^^^^^^^^^^^
				3270
				3271	* ``bool isIntegerTy() const``: Returns true for any integer type.
				3272
				3273	* ``bool isFloatingPointTy()``: Return true if this is one of the five
				3274	floating point types.
				3275
				3276	* ``bool isSized()``: Return true if the type has known size. Things
				3277	that don't have a size are abstract types, labels and void.
				3278
				3279	.. _derivedtypes:
				3280
				3281	Important Derived Types
				3282	^^^^^^^^^^^^^^^^^^^^^^^
				3283
				3284	``IntegerType``
				3285	Subclass of DerivedType that represents integer types of any bit width. Any
				3286	bit width between ``IntegerType::MIN_INT_BITS`` (1) and
				3287	``IntegerType::MAX_INT_BITS`` (~8 million) can be represented.
				3288
				3289	* ``static const IntegerType* get(unsigned NumBits)``: get an integer
				3290	type of a specific bit width.
				3291
				3292	* ``unsigned getBitWidth() const``: Get the bit width of an integer type.
				3293
				3294	``SequentialType``
Peter Collingbourne	4568158	2016-12-02 03:05:41 +0000	[diff] [blame]	3295	This is subclassed by ArrayType and VectorType.
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	3296
				3297	* ``const Type * getElementType() const``: Returns the type of each
				3298	of the elements in the sequential type.
				3299
Peter Collingbourne	bc07052	2016-12-02 03:20:58 +0000	[diff] [blame]	3300	* ``uint64_t getNumElements() const``: Returns the number of elements
				3301	in the sequential type.
				3302
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	3303	``ArrayType``
				3304	This is a subclass of SequentialType and defines the interface for array
				3305	types.
				3306
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	3307	``PointerType``
Peter Collingbourne	4568158	2016-12-02 03:05:41 +0000	[diff] [blame]	3308	Subclass of Type for pointer types.
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	3309
				3310	``VectorType``
				3311	Subclass of SequentialType for vector types. A vector type is similar to an
				3312	ArrayType but is distinguished because it is a first class type whereas
				3313	ArrayType is not. Vector types are used for vector operations and are usually
Ed Maste	8ed40ce	2015-04-14 20:52:58 +0000	[diff] [blame]	3314	small vectors of an integer or floating point type.
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	3315
				3316	``StructType``
				3317	Subclass of DerivedTypes for struct types.
				3318
				3319	.. _FunctionType:
				3320
				3321	``FunctionType``
				3322	Subclass of DerivedTypes for function types.
				3323
				3324	* ``bool isVarArg() const``: Returns true if it's a vararg function.
				3325
				3326	* ``const Type * getReturnType() const``: Returns the return type of the
				3327	function.
				3328
				3329	* ``const Type * getParamType (unsigned i)``: Returns the type of the ith
				3330	parameter.
				3331
				3332	* ``const unsigned getNumParams() const``: Returns the number of formal
				3333	parameters.
				3334
				3335	.. _Module:
				3336
				3337	The ``Module`` class
				3338	--------------------
				3339
Benjamin Kramer	9f566a5	2013-07-08 19:59:35 +0000	[diff] [blame]	3340	``#include "llvm/IR/Module.h"``
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	3341
				3342	header source: `Module.h <http://llvm.org/doxygen/Module_8h-source.html>`_
				3343
				3344	doxygen info: `Module Class <http://llvm.org/doxygen/classllvm_1_1Module.html>`_
				3345
				3346	The ``Module`` class represents the top level structure present in LLVM
				3347	programs. An LLVM module is effectively either a translation unit of the
				3348	original program or a combination of several translation units merged by the
				3349	linker. The ``Module`` class keeps track of a list of :ref:`Function
				3350	<c_Function>`\ s, a list of GlobalVariable_\ s, and a SymbolTable_.
				3351	Additionally, it contains a few helpful member functions that try to make common
				3352	operations easy.
				3353
				3354	.. _m_Module:
				3355
				3356	Important Public Members of the ``Module`` class
				3357	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				3358
				3359	* ``Module::Module(std::string name = "")``
				3360
				3361	Constructing a Module_ is easy. You can optionally provide a name for it
				3362	(probably based on the name of the translation unit).
				3363
				3364	* \| ``Module::iterator`` - Typedef for function list iterator
				3365	\| ``Module::const_iterator`` - Typedef for const_iterator.
				3366	\| ``begin()``, ``end()``, ``size()``, ``empty()``
				3367
				3368	These are forwarding methods that make it easy to access the contents of a
				3369	``Module`` object's :ref:`Function <c_Function>` list.
				3370
				3371	* ``Module::FunctionListType &getFunctionList()``
				3372
				3373	Returns the list of :ref:`Function <c_Function>`\ s. This is necessary to use
				3374	when you need to update the list or perform a complex action that doesn't have
				3375	a forwarding method.
				3376
				3377	----------------
				3378
				3379	* \| ``Module::global_iterator`` - Typedef for global variable list iterator
				3380	\| ``Module::const_global_iterator`` - Typedef for const_iterator.
				3381	\| ``global_begin()``, ``global_end()``, ``global_size()``, ``global_empty()``
				3382
				3383	These are forwarding methods that make it easy to access the contents of a
				3384	``Module`` object's GlobalVariable_ list.
				3385
				3386	* ``Module::GlobalListType &getGlobalList()``
				3387
				3388	Returns the list of GlobalVariable_\ s. This is necessary to use when you
				3389	need to update the list or perform a complex action that doesn't have a
				3390	forwarding method.
				3391
				3392	----------------
				3393
				3394	* ``SymbolTable *getSymbolTable()``
				3395
				3396	Return a reference to the SymbolTable_ for this ``Module``.
				3397
				3398	----------------
				3399
				3400	* ``Function *getFunction(StringRef Name) const``
				3401
				3402	Look up the specified function in the ``Module`` SymbolTable_. If it does not
				3403	exist, return ``null``.
				3404
				3405	* ``Function *getOrInsertFunction(const std::string &Name, const FunctionType
				3406	*T)``
				3407
				3408	Look up the specified function in the ``Module`` SymbolTable_. If it does not
				3409	exist, add an external declaration for the function and return it.
				3410
				3411	* ``std::string getTypeName(const Type *Ty)``
				3412
				3413	If there is at least one entry in the SymbolTable_ for the specified Type_,
				3414	return it. Otherwise return the empty string.
				3415
				3416	* ``bool addTypeName(const std::string &Name, const Type *Ty)``
				3417
				3418	Insert an entry in the SymbolTable_ mapping ``Name`` to ``Ty``. If there is
				3419	already an entry for this name, true is returned and the SymbolTable_ is not
				3420	modified.
				3421
				3422	.. _Value:
				3423
				3424	The ``Value`` class
				3425	-------------------
				3426
Benjamin Kramer	9f566a5	2013-07-08 19:59:35 +0000	[diff] [blame]	3427	``#include "llvm/IR/Value.h"``
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	3428
				3429	header source: `Value.h <http://llvm.org/doxygen/Value_8h-source.html>`_
				3430
				3431	doxygen info: `Value Class <http://llvm.org/doxygen/classllvm_1_1Value.html>`_
				3432
				3433	The ``Value`` class is the most important class in the LLVM Source base. It
				3434	represents a typed value that may be used (among other things) as an operand to
				3435	an instruction. There are many different types of ``Value``\ s, such as
				3436	Constant_\ s, Argument_\ s. Even Instruction_\ s and :ref:`Function
				3437	<c_Function>`\ s are ``Value``\ s.
				3438
				3439	A particular ``Value`` may be used many times in the LLVM representation for a
				3440	program. For example, an incoming argument to a function (represented with an
				3441	instance of the Argument_ class) is "used" by every instruction in the function
				3442	that references the argument. To keep track of this relationship, the ``Value``
				3443	class keeps a list of all of the ``User``\ s that is using it (the User_ class
				3444	is a base class for all nodes in the LLVM graph that can refer to ``Value``\ s).
				3445	This use list is how LLVM represents def-use information in the program, and is
				3446	accessible through the ``use_*`` methods, shown below.
				3447
				3448	Because LLVM is a typed representation, every LLVM ``Value`` is typed, and this
				3449	Type_ is available through the ``getType()`` method. In addition, all LLVM
				3450	values can be named. The "name" of the ``Value`` is a symbolic string printed
				3451	in the LLVM code:
				3452
				3453	.. code-block:: llvm
				3454
				3455	%foo = add i32 1, 2
				3456
				3457	.. _nameWarning:
				3458
				3459	The name of this instruction is "foo". NOTE that the name of any value may
				3460	be missing (an empty string), so names should ONLY be used for debugging
				3461	(making the source code easier to read, debugging printouts), they should not be
				3462	used to keep track of values or map between them. For this purpose, use a
				3463	``std::map`` of pointers to the ``Value`` itself instead.
				3464
				3465	One important aspect of LLVM is that there is no distinction between an SSA
				3466	variable and the operation that produces it. Because of this, any reference to
				3467	the value produced by an instruction (or the value available as an incoming
				3468	argument, for example) is represented as a direct pointer to the instance of the
				3469	class that represents this value. Although this may take some getting used to,
				3470	it simplifies the representation and makes it easier to manipulate.
				3471
				3472	.. _m_Value:
				3473
				3474	Important Public Members of the ``Value`` class
				3475	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				3476
				3477	* \| ``Value::use_iterator`` - Typedef for iterator over the use-list
				3478	\| ``Value::const_use_iterator`` - Typedef for const_iterator over the
				3479	use-list
				3480	\| ``unsigned use_size()`` - Returns the number of users of the value.
				3481	\| ``bool use_empty()`` - Returns true if there are no users.
				3482	\| ``use_iterator use_begin()`` - Get an iterator to the start of the
				3483	use-list.
				3484	\| ``use_iterator use_end()`` - Get an iterator to the end of the use-list.
				3485	\| ``User *use_back()`` - Returns the last element in the list.
				3486
				3487	These methods are the interface to access the def-use information in LLVM.
				3488	As with all other iterators in LLVM, the naming conventions follow the
				3489	conventions defined by the STL_.
				3490
				3491	* ``Type *getType() const``
				3492	This method returns the Type of the Value.
				3493
				3494	* \| ``bool hasName() const``
				3495	\| ``std::string getName() const``
				3496	\| ``void setName(const std::string &Name)``
				3497
				3498	This family of methods is used to access and assign a name to a ``Value``, be
				3499	aware of the :ref:`precaution above <nameWarning>`.
				3500
				3501	* ``void replaceAllUsesWith(Value *V)``
				3502
				3503	This method traverses the use list of a ``Value`` changing all User_\ s of the
				3504	current value to refer to "``V``" instead. For example, if you detect that an
				3505	instruction always produces a constant value (for example through constant
				3506	folding), you can replace all uses of the instruction with the constant like
				3507	this:
				3508
				3509	.. code-block:: c++
				3510
				3511	Inst->replaceAllUsesWith(ConstVal);
				3512
				3513	.. _User:
				3514
				3515	The ``User`` class
				3516	------------------
				3517
Benjamin Kramer	9f566a5	2013-07-08 19:59:35 +0000	[diff] [blame]	3518	``#include "llvm/IR/User.h"``
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	3519
				3520	header source: `User.h <http://llvm.org/doxygen/User_8h-source.html>`_
				3521
				3522	doxygen info: `User Class <http://llvm.org/doxygen/classllvm_1_1User.html>`_
				3523
				3524	Superclass: Value_
				3525
				3526	The ``User`` class is the common base class of all LLVM nodes that may refer to
				3527	``Value``\ s. It exposes a list of "Operands" that are all of the ``Value``\ s
				3528	that the User is referring to. The ``User`` class itself is a subclass of
				3529	``Value``.
				3530
				3531	The operands of a ``User`` point directly to the LLVM ``Value`` that it refers
				3532	to. Because LLVM uses Static Single Assignment (SSA) form, there can only be
				3533	one definition referred to, allowing this direct connection. This connection
				3534	provides the use-def information in LLVM.
				3535
				3536	.. _m_User:
				3537
				3538	Important Public Members of the ``User`` class
				3539	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				3540
				3541	The ``User`` class exposes the operand list in two ways: through an index access
				3542	interface and through an iterator based interface.
				3543
				3544	* \| ``Value *getOperand(unsigned i)``
				3545	\| ``unsigned getNumOperands()``
				3546
				3547	These two methods expose the operands of the ``User`` in a convenient form for
				3548	direct access.
				3549
				3550	* \| ``User::op_iterator`` - Typedef for iterator over the operand list
				3551	\| ``op_iterator op_begin()`` - Get an iterator to the start of the operand
				3552	list.
				3553	\| ``op_iterator op_end()`` - Get an iterator to the end of the operand list.
				3554
				3555	Together, these methods make up the iterator based interface to the operands
				3556	of a ``User``.
				3557
				3558
				3559	.. _Instruction:
				3560
				3561	The ``Instruction`` class
				3562	-------------------------
				3563
Benjamin Kramer	9f566a5	2013-07-08 19:59:35 +0000	[diff] [blame]	3564	``#include "llvm/IR/Instruction.h"``
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	3565
				3566	header source: `Instruction.h
				3567	<http://llvm.org/doxygen/Instruction_8h-source.html>`_
				3568
				3569	doxygen info: `Instruction Class
				3570	<http://llvm.org/doxygen/classllvm_1_1Instruction.html>`_
				3571
				3572	Superclasses: User_, Value_
				3573
				3574	The ``Instruction`` class is the common base class for all LLVM instructions.
				3575	It provides only a few methods, but is a very commonly used class. The primary
				3576	data tracked by the ``Instruction`` class itself is the opcode (instruction
				3577	type) and the parent BasicBlock_ the ``Instruction`` is embedded into. To
				3578	represent a specific type of instruction, one of many subclasses of
				3579	``Instruction`` are used.
				3580
				3581	Because the ``Instruction`` class subclasses the User_ class, its operands can
				3582	be accessed in the same way as for other ``User``\ s (with the
				3583	``getOperand()``/``getNumOperands()`` and ``op_begin()``/``op_end()`` methods).
				3584	An important file for the ``Instruction`` class is the ``llvm/Instruction.def``
				3585	file. This file contains some meta-data about the various different types of
				3586	instructions in LLVM. It describes the enum values that are used as opcodes
				3587	(for example ``Instruction::Add`` and ``Instruction::ICmp``), as well as the
				3588	concrete sub-classes of ``Instruction`` that implement the instruction (for
				3589	example BinaryOperator_ and CmpInst_). Unfortunately, the use of macros in this
				3590	file confuses doxygen, so these enum values don't show up correctly in the
				3591	`doxygen output <http://llvm.org/doxygen/classllvm_1_1Instruction.html>`_.
				3592
				3593	.. _s_Instruction:
				3594
				3595	Important Subclasses of the ``Instruction`` class
				3596	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				3597
				3598	.. _BinaryOperator:
				3599
				3600	* ``BinaryOperator``
				3601
				3602	This subclasses represents all two operand instructions whose operands must be
				3603	the same type, except for the comparison instructions.
				3604
				3605	.. _CastInst:
				3606
				3607	* ``CastInst``
				3608	This subclass is the parent of the 12 casting instructions. It provides
				3609	common operations on cast instructions.
				3610
				3611	.. _CmpInst:
				3612
				3613	* ``CmpInst``
				3614
				3615	This subclass respresents the two comparison instructions,
				3616	`ICmpInst <LangRef.html#i_icmp>`_ (integer opreands), and
				3617	`FCmpInst <LangRef.html#i_fcmp>`_ (floating point operands).
				3618
				3619	.. _TerminatorInst:
				3620
				3621	* ``TerminatorInst``
				3622
				3623	This subclass is the parent of all terminator instructions (those which can
				3624	terminate a block).
				3625
				3626	.. _m_Instruction:
				3627
				3628	Important Public Members of the ``Instruction`` class
				3629	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				3630
				3631	* ``BasicBlock *getParent()``
				3632
				3633	Returns the BasicBlock_ that this
				3634	``Instruction`` is embedded into.
				3635
				3636	* ``bool mayWriteToMemory()``
				3637
				3638	Returns true if the instruction writes to memory, i.e. it is a ``call``,
				3639	``free``, ``invoke``, or ``store``.
				3640
				3641	* ``unsigned getOpcode()``
				3642
				3643	Returns the opcode for the ``Instruction``.
				3644
				3645	* ``Instruction *clone() const``
				3646
				3647	Returns another instance of the specified instruction, identical in all ways
				3648	to the original except that the instruction has no parent (i.e. it's not
				3649	embedded into a BasicBlock_), and it has no name.
				3650
				3651	.. _Constant:
				3652
				3653	The ``Constant`` class and subclasses
				3654	-------------------------------------
				3655
				3656	Constant represents a base class for different types of constants. It is
				3657	subclassed by ConstantInt, ConstantArray, etc. for representing the various
				3658	types of Constants. GlobalValue_ is also a subclass, which represents the
				3659	address of a global variable or function.
				3660
				3661	.. _s_Constant:
				3662
				3663	Important Subclasses of Constant
				3664	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				3665
				3666	* ConstantInt : This subclass of Constant represents an integer constant of
				3667	any width.
				3668
				3669	* ``const APInt& getValue() const``: Returns the underlying
				3670	value of this constant, an APInt value.
				3671
				3672	* ``int64_t getSExtValue() const``: Converts the underlying APInt value to an
				3673	int64_t via sign extension. If the value (not the bit width) of the APInt
				3674	is too large to fit in an int64_t, an assertion will result. For this
				3675	reason, use of this method is discouraged.
				3676
				3677	* ``uint64_t getZExtValue() const``: Converts the underlying APInt value
				3678	to a uint64_t via zero extension. IF the value (not the bit width) of the
				3679	APInt is too large to fit in a uint64_t, an assertion will result. For this
				3680	reason, use of this method is discouraged.
				3681
				3682	* ``static ConstantInt* get(const APInt& Val)``: Returns the ConstantInt
				3683	object that represents the value provided by ``Val``. The type is implied
				3684	as the IntegerType that corresponds to the bit width of ``Val``.
				3685
				3686	* ``static ConstantInt* get(const Type *Ty, uint64_t Val)``: Returns the
				3687	ConstantInt object that represents the value provided by ``Val`` for integer
				3688	type ``Ty``.
				3689
				3690	* ConstantFP : This class represents a floating point constant.
				3691
				3692	* ``double getValue() const``: Returns the underlying value of this constant.
				3693
				3694	* ConstantArray : This represents a constant array.
				3695
				3696	* ``const std::vector<Use> &getValues() const``: Returns a vector of
				3697	component constants that makeup this array.
				3698
				3699	* ConstantStruct : This represents a constant struct.
				3700
				3701	* ``const std::vector<Use> &getValues() const``: Returns a vector of
				3702	component constants that makeup this array.
				3703
				3704	* GlobalValue : This represents either a global variable or a function. In
				3705	either case, the value is a constant fixed address (after linking).
				3706
				3707	.. _GlobalValue:
				3708
				3709	The ``GlobalValue`` class
				3710	-------------------------
				3711
Benjamin Kramer	9f566a5	2013-07-08 19:59:35 +0000	[diff] [blame]	3712	``#include "llvm/IR/GlobalValue.h"``
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	3713
				3714	header source: `GlobalValue.h
				3715	<http://llvm.org/doxygen/GlobalValue_8h-source.html>`_
				3716
				3717	doxygen info: `GlobalValue Class
				3718	<http://llvm.org/doxygen/classllvm_1_1GlobalValue.html>`_
				3719
				3720	Superclasses: Constant_, User_, Value_
				3721
				3722	Global values ( GlobalVariable_\ s or :ref:`Function <c_Function>`\ s) are the
				3723	only LLVM values that are visible in the bodies of all :ref:`Function
				3724	<c_Function>`\ s. Because they are visible at global scope, they are also
				3725	subject to linking with other globals defined in different translation units.
				3726	To control the linking process, ``GlobalValue``\ s know their linkage rules.
				3727	Specifically, ``GlobalValue``\ s know whether they have internal or external
				3728	linkage, as defined by the ``LinkageTypes`` enumeration.
				3729
				3730	If a ``GlobalValue`` has internal linkage (equivalent to being ``static`` in C),
				3731	it is not visible to code outside the current translation unit, and does not
				3732	participate in linking. If it has external linkage, it is visible to external
				3733	code, and does participate in linking. In addition to linkage information,
				3734	``GlobalValue``\ s keep track of which Module_ they are currently part of.
				3735
				3736	Because ``GlobalValue``\ s are memory objects, they are always referred to by
				3737	their address. As such, the Type_ of a global is always a pointer to its
				3738	contents. It is important to remember this when using the ``GetElementPtrInst``
				3739	instruction because this pointer must be dereferenced first. For example, if
				3740	you have a ``GlobalVariable`` (a subclass of ``GlobalValue)`` that is an array
				3741	of 24 ints, type ``[24 x i32]``, then the ``GlobalVariable`` is a pointer to
				3742	that array. Although the address of the first element of this array and the
				3743	value of the ``GlobalVariable`` are the same, they have different types. The
				3744	``GlobalVariable``'s type is ``[24 x i32]``. The first element's type is
				3745	``i32.`` Because of this, accessing a global value requires you to dereference
				3746	the pointer with ``GetElementPtrInst`` first, then its elements can be accessed.
				3747	This is explained in the `LLVM Language Reference Manual
				3748	<LangRef.html#globalvars>`_.
				3749
				3750	.. _m_GlobalValue:
				3751
				3752	Important Public Members of the ``GlobalValue`` class
				3753	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				3754
				3755	* \| ``bool hasInternalLinkage() const``
				3756	\| ``bool hasExternalLinkage() const``
				3757	\| ``void setInternalLinkage(bool HasInternalLinkage)``
				3758
				3759	These methods manipulate the linkage characteristics of the ``GlobalValue``.
				3760
				3761	* ``Module *getParent()``
				3762
				3763	This returns the Module_ that the
				3764	GlobalValue is currently embedded into.
				3765
				3766	.. _c_Function:
				3767
				3768	The ``Function`` class
				3769	----------------------
				3770
Benjamin Kramer	9f566a5	2013-07-08 19:59:35 +0000	[diff] [blame]	3771	``#include "llvm/IR/Function.h"``
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	3772
				3773	header source: `Function.h <http://llvm.org/doxygen/Function_8h-source.html>`_
				3774
				3775	doxygen info: `Function Class
				3776	<http://llvm.org/doxygen/classllvm_1_1Function.html>`_
				3777
				3778	Superclasses: GlobalValue_, Constant_, User_, Value_
				3779
				3780	The ``Function`` class represents a single procedure in LLVM. It is actually
				3781	one of the more complex classes in the LLVM hierarchy because it must keep track
				3782	of a large amount of data. The ``Function`` class keeps track of a list of
				3783	BasicBlock_\ s, a list of formal Argument_\ s, and a SymbolTable_.
				3784
				3785	The list of BasicBlock_\ s is the most commonly used part of ``Function``
				3786	objects. The list imposes an implicit ordering of the blocks in the function,
				3787	which indicate how the code will be laid out by the backend. Additionally, the
				3788	first BasicBlock_ is the implicit entry node for the ``Function``. It is not
				3789	legal in LLVM to explicitly branch to this initial block. There are no implicit
				3790	exit nodes, and in fact there may be multiple exit nodes from a single
				3791	``Function``. If the BasicBlock_ list is empty, this indicates that the
				3792	``Function`` is actually a function declaration: the actual body of the function
				3793	hasn't been linked in yet.
				3794
				3795	In addition to a list of BasicBlock_\ s, the ``Function`` class also keeps track
				3796	of the list of formal Argument_\ s that the function receives. This container
				3797	manages the lifetime of the Argument_ nodes, just like the BasicBlock_ list does
				3798	for the BasicBlock_\ s.
				3799
				3800	The SymbolTable_ is a very rarely used LLVM feature that is only used when you
				3801	have to look up a value by name. Aside from that, the SymbolTable_ is used
				3802	internally to make sure that there are not conflicts between the names of
				3803	Instruction_\ s, BasicBlock_\ s, or Argument_\ s in the function body.
				3804
				3805	Note that ``Function`` is a GlobalValue_ and therefore also a Constant_. The
				3806	value of the function is its address (after linking) which is guaranteed to be
				3807	constant.
				3808
				3809	.. _m_Function:
				3810
				3811	Important Public Members of the ``Function``
				3812	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				3813
				3814	* ``Function(const FunctionType *Ty, LinkageTypes Linkage,
				3815	const std::string &N = "", Module* Parent = 0)``
				3816
				3817	Constructor used when you need to create new ``Function``\ s to add the
				3818	program. The constructor must specify the type of the function to create and
				3819	what type of linkage the function should have. The FunctionType_ argument
				3820	specifies the formal arguments and return value for the function. The same
				3821	FunctionType_ value can be used to create multiple functions. The ``Parent``
				3822	argument specifies the Module in which the function is defined. If this
				3823	argument is provided, the function will automatically be inserted into that
				3824	module's list of functions.
				3825
				3826	* ``bool isDeclaration()``
				3827
				3828	Return whether or not the ``Function`` has a body defined. If the function is
				3829	"external", it does not have a body, and thus must be resolved by linking with
				3830	a function defined in a different translation unit.
				3831
				3832	* \| ``Function::iterator`` - Typedef for basic block list iterator
				3833	\| ``Function::const_iterator`` - Typedef for const_iterator.
				3834	\| ``begin()``, ``end()``, ``size()``, ``empty()``
				3835
				3836	These are forwarding methods that make it easy to access the contents of a
				3837	``Function`` object's BasicBlock_ list.
				3838
				3839	* ``Function::BasicBlockListType &getBasicBlockList()``
				3840
				3841	Returns the list of BasicBlock_\ s. This is necessary to use when you need to
				3842	update the list or perform a complex action that doesn't have a forwarding
				3843	method.
				3844
				3845	* \| ``Function::arg_iterator`` - Typedef for the argument list iterator
				3846	\| ``Function::const_arg_iterator`` - Typedef for const_iterator.
				3847	\| ``arg_begin()``, ``arg_end()``, ``arg_size()``, ``arg_empty()``
				3848
				3849	These are forwarding methods that make it easy to access the contents of a
				3850	``Function`` object's Argument_ list.
				3851
				3852	* ``Function::ArgumentListType &getArgumentList()``
				3853
				3854	Returns the list of Argument_. This is necessary to use when you need to
				3855	update the list or perform a complex action that doesn't have a forwarding
				3856	method.
				3857
				3858	* ``BasicBlock &getEntryBlock()``
				3859
				3860	Returns the entry ``BasicBlock`` for the function. Because the entry block
				3861	for the function is always the first block, this returns the first block of
				3862	the ``Function``.
				3863
				3864	* \| ``Type *getReturnType()``
				3865	\| ``FunctionType *getFunctionType()``
				3866
				3867	This traverses the Type_ of the ``Function`` and returns the return type of
				3868	the function, or the FunctionType_ of the actual function.
				3869
				3870	* ``SymbolTable *getSymbolTable()``
				3871
				3872	Return a pointer to the SymbolTable_ for this ``Function``.
				3873
				3874	.. _GlobalVariable:
				3875
				3876	The ``GlobalVariable`` class
				3877	----------------------------
				3878
Benjamin Kramer	9f566a5	2013-07-08 19:59:35 +0000	[diff] [blame]	3879	``#include "llvm/IR/GlobalVariable.h"``
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	3880
				3881	header source: `GlobalVariable.h
				3882	<http://llvm.org/doxygen/GlobalVariable_8h-source.html>`_
				3883
				3884	doxygen info: `GlobalVariable Class
				3885	<http://llvm.org/doxygen/classllvm_1_1GlobalVariable.html>`_
				3886
				3887	Superclasses: GlobalValue_, Constant_, User_, Value_
				3888
				3889	Global variables are represented with the (surprise surprise) ``GlobalVariable``
				3890	class. Like functions, ``GlobalVariable``\ s are also subclasses of
				3891	GlobalValue_, and as such are always referenced by their address (global values
				3892	must live in memory, so their "name" refers to their constant address). See
				3893	GlobalValue_ for more on this. Global variables may have an initial value
				3894	(which must be a Constant_), and if they have an initializer, they may be marked
				3895	as "constant" themselves (indicating that their contents never change at
				3896	runtime).
				3897
				3898	.. _m_GlobalVariable:
				3899
				3900	Important Public Members of the ``GlobalVariable`` class
				3901	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				3902
				3903	* ``GlobalVariable(const Type *Ty, bool isConstant, LinkageTypes &Linkage,
				3904	Constant Initializer = 0, const std::string &Name = "", Module Parent = 0)``
				3905
				3906	Create a new global variable of the specified type. If ``isConstant`` is true
				3907	then the global variable will be marked as unchanging for the program. The
				3908	Linkage parameter specifies the type of linkage (internal, external, weak,
				3909	linkonce, appending) for the variable. If the linkage is InternalLinkage,
				3910	WeakAnyLinkage, WeakODRLinkage, LinkOnceAnyLinkage or LinkOnceODRLinkage, then
				3911	the resultant global variable will have internal linkage. AppendingLinkage
				3912	concatenates together all instances (in different translation units) of the
				3913	variable into a single variable but is only applicable to arrays. See the
				3914	`LLVM Language Reference <LangRef.html#modulestructure>`_ for further details
				3915	on linkage types. Optionally an initializer, a name, and the module to put
				3916	the variable into may be specified for the global variable as well.
				3917
				3918	* ``bool isConstant() const``
				3919
				3920	Returns true if this is a global variable that is known not to be modified at
				3921	runtime.
				3922
				3923	* ``bool hasInitializer()``
				3924
				3925	Returns true if this ``GlobalVariable`` has an intializer.
				3926
				3927	* ``Constant *getInitializer()``
				3928
				3929	Returns the initial value for a ``GlobalVariable``. It is not legal to call
				3930	this method if there is no initializer.
				3931
				3932	.. _BasicBlock:
				3933
				3934	The ``BasicBlock`` class
				3935	------------------------
				3936
Benjamin Kramer	9f566a5	2013-07-08 19:59:35 +0000	[diff] [blame]	3937	``#include "llvm/IR/BasicBlock.h"``
Sean Silva	beb15ca	2012-12-04 03:20:08 +0000	[diff] [blame]	3938
				3939	header source: `BasicBlock.h
				3940	<http://llvm.org/doxygen/BasicBlock_8h-source.html>`_
				3941
				3942	doxygen info: `BasicBlock Class
				3943	<http://llvm.org/doxygen/classllvm_1_1BasicBlock.html>`_
				3944
				3945	Superclass: Value_
				3946
				3947	This class represents a single entry single exit section of the code, commonly
				3948	known as a basic block by the compiler community. The ``BasicBlock`` class
				3949	maintains a list of Instruction_\ s, which form the body of the block. Matching
				3950	the language definition, the last element of this list of instructions is always
				3951	a terminator instruction (a subclass of the TerminatorInst_ class).
				3952
				3953	In addition to tracking the list of instructions that make up the block, the
				3954	``BasicBlock`` class also keeps track of the :ref:`Function <c_Function>` that
				3955	it is embedded into.
				3956
				3957	Note that ``BasicBlock``\ s themselves are Value_\ s, because they are
				3958	referenced by instructions like branches and can go in the switch tables.
				3959	``BasicBlock``\ s have type ``label``.
				3960
				3961	.. _m_BasicBlock:
				3962
				3963	Important Public Members of the ``BasicBlock`` class
				3964	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
				3965
				3966	* ``BasicBlock(const std::string &Name = "", Function *Parent = 0)``
				3967
				3968	The ``BasicBlock`` constructor is used to create new basic blocks for
				3969	insertion into a function. The constructor optionally takes a name for the
				3970	new block, and a :ref:`Function <c_Function>` to insert it into. If the
				3971	``Parent`` parameter is specified, the new ``BasicBlock`` is automatically
				3972	inserted at the end of the specified :ref:`Function <c_Function>`, if not
				3973	specified, the BasicBlock must be manually inserted into the :ref:`Function
				3974	<c_Function>`.
				3975
				3976	* \| ``BasicBlock::iterator`` - Typedef for instruction list iterator
				3977	\| ``BasicBlock::const_iterator`` - Typedef for const_iterator.
				3978	\| ``begin()``, ``end()``, ``front()``, ``back()``,
				3979	``size()``, ``empty()``
				3980	STL-style functions for accessing the instruction list.
				3981
				3982	These methods and typedefs are forwarding functions that have the same
				3983	semantics as the standard library methods of the same names. These methods
				3984	expose the underlying instruction list of a basic block in a way that is easy
				3985	to manipulate. To get the full complement of container operations (including
				3986	operations to update the list), you must use the ``getInstList()`` method.
				3987
				3988	* ``BasicBlock::InstListType &getInstList()``
				3989
				3990	This method is used to get access to the underlying container that actually
				3991	holds the Instructions. This method must be used when there isn't a
				3992	forwarding function in the ``BasicBlock`` class for the operation that you
				3993	would like to perform. Because there are no forwarding functions for
				3994	"updating" operations, you need to use this if you want to update the contents
				3995	of a ``BasicBlock``.
				3996
				3997	* ``Function *getParent()``
				3998
				3999	Returns a pointer to :ref:`Function <c_Function>` the block is embedded into,
				4000	or a null pointer if it is homeless.
				4001
				4002	* ``TerminatorInst *getTerminator()``
				4003
				4004	Returns a pointer to the terminator instruction that appears at the end of the
				4005	``BasicBlock``. If there is no terminator instruction, or if the last
				4006	instruction in the block is not a terminator, then a null pointer is returned.
				4007
				4008	.. _Argument:
				4009
				4010	The ``Argument`` class
				4011	----------------------
				4012
				4013	This subclass of Value defines the interface for incoming formal arguments to a
				4014	function. A Function maintains a list of its formal arguments. An argument has
				4015	a pointer to the parent Function.
				4016
				4017