.. _gep:

=======================================
The Often Misunderstood GEP Instruction
=======================================

.. contents::
   :local:

Introduction
============

This document seeks to dispel the mystery and confusion surrounding LLVM's
`GetElementPtr <LangRef.html#i_getelementptr>`_ (GEP) instruction. Questions
about the wily GEP instruction are probably the most frequently occurring
questions once a developer gets down to coding with LLVM. Here we lay out the
sources of confusion and show that the GEP instruction is really quite simple.

Address Computation
===================

When people are first confronted with the GEP instruction, they tend to relate
it to known concepts from other programming paradigms, most notably C array
indexing and field selection. GEP closely resembles C array indexing and field
selection; however, it is a little different, and this leads to the following
questions.

What is the first index of the GEP instruction?
-----------------------------------------------

Quick answer: The index stepping through the first operand.

The confusion with the first index usually arises from thinking about the
GetElementPtr instruction as if it were a C index operator. They aren't the
same. For example, when we write, in "C":

.. code-block:: c++

  AType *Foo;
  ...
  X = &Foo->F;

it is natural to think that there is only one index, the selection of the field
``F``. However, in this example, ``Foo`` is a pointer. That pointer
must be indexed explicitly in LLVM. C, on the other hand, indexes through it
transparently. To arrive at the same address location as the C code, you would
provide the GEP instruction with two index operands. The first operand indexes
through the pointer; the second operand indexes the field ``F`` of the
structure, just as if you wrote:

.. code-block:: c++

  X = &Foo[0].F;
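
The equivalence of the two spellings can be checked in plain C++. The following
is a minimal sketch and not part of LLVM: ``AType`` is given two hypothetical
``int`` fields (``E`` and ``F``), and the helper names are illustrative. Each
helper returns the address of ``F`` as an integer, computed the C way, the GEP
way (index 0 through the pointer, then the field), and as raw byte arithmetic.

.. code-block:: c++

  #include <cstddef>
  #include <cstdint>

  // Hypothetical aggregate standing in for AType; field names are assumptions.
  struct AType {
    int E;
    int F;
  };

  // The C spelling: only the field selection is visible.
  inline std::uintptr_t field_via_arrow(AType *Foo) {
    return reinterpret_cast<std::uintptr_t>(&Foo->F);
  }

  // The GEP spelling: index 0 through the pointer, then select the field.
  inline std::uintptr_t field_via_index(AType *Foo) {
    return reinterpret_cast<std::uintptr_t>(&Foo[0].F);
  }

  // The same address as byte arithmetic on the base pointer.
  inline std::uintptr_t field_via_bytes(AType *Foo) {
    return reinterpret_cast<std::uintptr_t>(Foo) + offsetof(AType, F);
  }

For any valid ``AType`` object, all three helpers return the same value.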

Sometimes this question gets rephrased as:

.. _GEP index through first pointer:

*Why is it okay to index through the first pointer, but subsequent pointers
won't be dereferenced?*

The answer is simply that memory does not have to be accessed to perform the
computation. The first operand to the GEP instruction must be a value of a
pointer type. The value of the pointer is provided directly to the GEP
instruction as an operand without any need for accessing memory. It must,
therefore, be indexed and requires an index operand. Consider this example:

.. code-block:: c++

  struct munger_struct {
    int f1;
    int f2;
  };
  void munge(struct munger_struct *P) {
    P[0].f1 = P[1].f1 + P[2].f2;
  }
  ...
  munger_struct Array[3];
  ...
  munge(Array);

In this "C" example, the front-end compiler (llvm-gcc) will generate three GEP
instructions for the three indices through "P" in the assignment statement. The
function argument ``P`` will be the first operand of each of these GEP
instructions. The second operand indexes through that pointer. The third
operand will be the field offset into the ``struct munger_struct`` type, for
either the ``f1`` or ``f2`` field. So, in LLVM assembly the ``munge`` function
looks like:

.. code-block:: llvm

  void %munge(%struct.munger_struct* %P) {
  entry:
    %tmp = getelementptr %struct.munger_struct* %P, i32 1, i32 0   ; &P[1].f1
    %tmp5 = load i32* %tmp
    %tmp6 = getelementptr %struct.munger_struct* %P, i32 2, i32 1  ; &P[2].f2
    %tmp7 = load i32* %tmp6
    %tmp8 = add i32 %tmp7, %tmp5
    %tmp9 = getelementptr %struct.munger_struct* %P, i32 0, i32 0  ; &P[0].f1
    store i32 %tmp8, i32* %tmp9
    ret void
  }

In each case the first operand is the pointer through which the GEP instruction
starts. The same is true whether the first operand is an argument, allocated
memory, or a global variable.

To make this clear, let's consider a more obtuse example:

.. code-block:: llvm

  %MyVar = uninitialized global i32
  ...
  %idx1 = getelementptr i32* %MyVar, i64 0
  %idx2 = getelementptr i32* %MyVar, i64 1
  %idx3 = getelementptr i32* %MyVar, i64 2

These GEP instructions are simply making address computations from the base
address of ``MyVar``. They compute the following (using C syntax):

.. code-block:: c++

  idx1 = (char*) &MyVar + 0
  idx2 = (char*) &MyVar + 4
  idx3 = (char*) &MyVar + 8

Since the type ``i32`` is known to be four bytes long, the indices 0, 1 and 2
translate into memory offsets of 0, 4, and 8, respectively. No memory is
accessed to make these computations because the address of ``%MyVar`` is passed
directly to the GEP instructions.
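
The arithmetic behind these offsets is just a multiplication of the index by
the element size. A minimal C++ sketch follows; the function name
``gep_i32_offset`` is illustrative, and ``std::int32_t`` is assumed to occupy
four bytes, matching the ``i32`` in the text.

.. code-block:: c++

  #include <cstdint>

  // Byte offset of a single-index GEP over i32: Index * sizeof(i32).
  constexpr std::int64_t gep_i32_offset(std::int64_t Index) {
    return Index * static_cast<std::int64_t>(sizeof(std::int32_t));
  }

  // The three GEPs from the example above.
  static_assert(gep_i32_offset(0) == 0, "idx1 is MyVar + 0");
  static_assert(gep_i32_offset(1) == 4, "idx2 is MyVar + 4");
  static_assert(gep_i32_offset(2) == 8, "idx3 is MyVar + 8");

Nothing in the computation consults the contents of memory; only the base
address and the static element size are involved.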

The obtuse part of this example is in the cases of ``%idx2`` and ``%idx3``. They
result in the computation of addresses that point to memory past the end of the
``%MyVar`` global, which is only one ``i32`` long, not three ``i32``\s long.
While this is legal in LLVM, it is inadvisable because any load or store with
the pointer that results from these GEP instructions would produce undefined
results.

Why is the extra 0 index required?
----------------------------------

Quick answer: there are no superfluous indices.

This question arises most often when the GEP instruction is applied to a global
variable, which is always of pointer type. For example, consider this:

.. code-block:: llvm

  %MyStruct = uninitialized global { float*, i32 }
  ...
  %idx = getelementptr { float*, i32 }* %MyStruct, i64 0, i32 1

The GEP above yields an ``i32*`` by indexing the ``i32`` typed field of the
structure ``%MyStruct``. When people first look at it, they wonder why the
``i64 0`` index is needed. However, a closer inspection of how globals and GEPs
work reveals the need. Becoming aware of the following facts will dispel the
confusion:

#. The type of ``%MyStruct`` is *not* ``{ float*, i32 }`` but rather
   ``{ float*, i32 }*``. That is, ``%MyStruct`` is a pointer to a structure
   containing a pointer to a ``float`` and an ``i32``.

#. Point #1 is evidenced by noticing the type of the first operand of the GEP
   instruction (``%MyStruct``), which is ``{ float*, i32 }*``.

#. The first index, ``i64 0``, is required to step over the global variable
   ``%MyStruct``. Since the first argument to the GEP instruction must always
   be a value of pointer type, the first index steps through that pointer. A
   value of 0 means 0 elements offset from that pointer.

#. The second index, ``i32 1``, selects the second field of the structure (the
   ``i32``).
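
The roles of the two indices can be sketched as a byte-offset computation in
C++. This is an illustration under assumptions: ``MyStructTy`` is a C++
stand-in for ``{ float*, i32 }``, its member names are made up, and the helper
handles only non-negative first indices.

.. code-block:: c++

  #include <cstddef>
  #include <cstdint>

  // C++ stand-in for the LLVM type { float*, i32 }; member names are
  // illustrative.
  struct MyStructTy {
    float *Ptr;
    std::int32_t Int;
  };

  // Offset for "getelementptr ..., i64 First, i32 1": the first index steps
  // over whole structures, the second selects the i32 field inside one.
  constexpr std::size_t offset_of_field1(std::size_t First) {
    return First * sizeof(MyStructTy) + offsetof(MyStructTy, Int);
  }

With ``First`` equal to 0 the result is just the field offset; with 1 it is the
same field in the structure that would follow ``%MyStruct`` in memory. The
first index is therefore never redundant, even when it happens to be zero.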

What is dereferenced by GEP?
----------------------------

Quick answer: nothing.

The GetElementPtr instruction dereferences nothing. That is, it doesn't access
memory in any way. That's what the Load and Store instructions are for. GEP is
only involved in the computation of addresses. For example, consider this:

.. code-block:: llvm

  %MyVar = uninitialized global { [40 x i32 ]* }
  ...
  %idx = getelementptr { [40 x i32]* }* %MyVar, i64 0, i32 0, i64 0, i64 17

In this example, we have a global variable, ``%MyVar``, that is a pointer to a
structure containing a pointer to an array of 40 ints. The GEP instruction seems
to be accessing the 18th integer of the structure's array of ints. However, this
is actually an illegal GEP instruction. It won't compile. The reason is that the
pointer in the structure *must* be dereferenced in order to index into the
array of 40 ints. Since the GEP instruction never accesses memory, it is
illegal.

In order to access the 18th integer in the array, you would need to do the
following:

.. code-block:: llvm

  %idx = getelementptr { [40 x i32]* }* %MyVar, i64 0, i32 0
  %arr = load [40 x i32]** %idx
  %idx2 = getelementptr [40 x i32]* %arr, i64 0, i64 17

In this case, we have to load the pointer in the structure with a load
instruction before we can index into the array. If the example were changed to:

.. code-block:: llvm

  %MyVar = uninitialized global { [40 x i32 ] }
  ...
  %idx = getelementptr { [40 x i32] }* %MyVar, i64 0, i32 0, i64 17

then everything works fine. In this case, the structure does not contain a
pointer and the GEP instruction can index through the global variable, into the
first field of the structure and access the 18th ``i32`` in the array there.
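
The same distinction exists in C++, where it is easier to see which step reads
memory. The sketch below is illustrative only; the type names ``Embedded`` and
``Indirect`` and the helper ``elem17`` are assumptions, not LLVM names.

.. code-block:: c++

  #include <cstdint>

  struct Embedded { std::int32_t Arr[40]; };     // like { [40 x i32] }
  struct Indirect { std::int32_t (*Arr)[40]; };  // like { [40 x i32]* }

  // Array stored inline: the element address is pure arithmetic on S.
  inline std::int32_t *elem17(Embedded *S) {
    return &S->Arr[17];
  }

  // Array behind a pointer: the pointer must be loaded from memory first,
  // and only then can the element address be computed from the loaded value.
  inline std::int32_t *elem17(Indirect *S) {
    std::int32_t (*Loaded)[40] = S->Arr;  // the explicit load
    return &(*Loaded)[17];                // the second address computation
  }

The first overload corresponds to the single legal GEP; the second corresponds
to the GEP, load, GEP sequence.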

Why don't GEP x,0,0,1 and GEP x,1 alias?
----------------------------------------

Quick Answer: They compute different address locations.

If you look at the first indices in these GEP instructions you find that they
are different (0 and 1); therefore, the address computation diverges with that
index. Consider this example:

.. code-block:: llvm

  %MyVar = global { [10 x i32 ] }
  %idx1 = getelementptr { [10 x i32 ] }* %MyVar, i64 0, i32 0, i64 1
  %idx2 = getelementptr { [10 x i32 ] }* %MyVar, i64 1

In this example, ``idx1`` computes the address of the second integer in the
array that is in the structure in ``%MyVar``, that is ``MyVar+4``. The type of
``idx1`` is ``i32*``. However, ``idx2`` computes the address of *the next*
structure after ``%MyVar``. The type of ``idx2`` is ``{ [10 x i32] }*`` and its
value is equivalent to ``MyVar + 40`` because it indexes past the ten 4-byte
integers in ``MyVar``. Obviously, in such a situation, the pointers don't
alias.
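
The two offsets can be reproduced with ``sizeof`` and ``offsetof`` in C++. This
is a sketch under assumptions: ``Wrapper`` is an illustrative stand-in for
``{ [10 x i32] }``, and the checks assume a 4-byte ``std::int32_t`` with no
padding in the structure.

.. code-block:: c++

  #include <cstddef>
  #include <cstdint>

  struct Wrapper { std::int32_t Arr[10]; };  // like { [10 x i32] }

  // GEP x, 0, 0, 1: zero structures over, field 0, array element 1.
  constexpr std::size_t offset_0_0_1() {
    return 0 * sizeof(Wrapper) + offsetof(Wrapper, Arr) +
           1 * sizeof(std::int32_t);
  }

  // GEP x, 1: one whole structure over.
  constexpr std::size_t offset_1() {
    return 1 * sizeof(Wrapper);
  }

  static_assert(offset_0_0_1() == 4, "idx1 is MyVar + 4");
  static_assert(offset_1() == 40, "idx2 is MyVar + 40");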

Why do GEP x,1,0,0 and GEP x,1 alias?
-------------------------------------

Quick Answer: They compute the same address location.

These two GEP instructions will compute the same address because indexing
through the 0th element does not change the address. However, it does change the
type. Consider this example:

.. code-block:: llvm

  %MyVar = global { [10 x i32 ] }
  %idx1 = getelementptr { [10 x i32 ] }* %MyVar, i64 1, i32 0, i64 0
  %idx2 = getelementptr { [10 x i32 ] }* %MyVar, i64 1

In this example, the value of ``%idx1`` is ``%MyVar+40`` and its type is
``i32*``. The value of ``%idx2`` is also ``MyVar+40`` but its type is
``{ [10 x i32] }*``.
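
A C++ analogue makes the "same address, different type" point concrete. The
sketch below is illustrative: ``Wrapper`` stands in for ``{ [10 x i32] }``, the
function name is made up, and ``X`` is assumed to point at an array of at least
two ``Wrapper`` objects so that ``X[1]`` is a real object.

.. code-block:: c++

  #include <cstdint>
  #include <type_traits>

  struct Wrapper { std::int32_t Arr[10]; };  // like { [10 x i32] }

  // Returns true when the two pointers hold the same address.
  inline bool same_address_different_type(Wrapper *X) {
    std::int32_t *Idx1 = &X[1].Arr[0];  // like GEP x, 1, 0, 0
    Wrapper *Idx2 = &X[1];              // like GEP x, 1
    static_assert(!std::is_same<decltype(Idx1), decltype(Idx2)>::value,
                  "the trailing zero indices change the pointer type");
    return static_cast<void *>(Idx1) == static_cast<void *>(Idx2);
  }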

Can GEP index into vector elements?
-----------------------------------

This hasn't always been forcefully disallowed, though it's not recommended. It
leads to awkward special cases in the optimizers, and fundamental inconsistency
in the IR. In the future, it will probably be outright disallowed.

What effect do address spaces have on GEPs?
-------------------------------------------

None, except that the address space qualifier on the first operand pointer type
always matches the address space qualifier on the result type.

How is GEP different from ``ptrtoint``, arithmetic, and ``inttoptr``?
---------------------------------------------------------------------

It's very similar; there are only subtle differences.

With ptrtoint, you have to pick an integer type. One approach is to pick i64;
this is safe on everything LLVM supports (LLVM internally assumes pointers are
never wider than 64 bits in many places), and in most cases the optimizer will
narrow the i64 arithmetic down to the actual pointer size on targets which
don't support 64-bit arithmetic. However, there are some cases where it doesn't
do this. With GEP you can avoid this problem.

Also, GEP carries additional pointer aliasing rules. It's invalid to take a GEP
from one object, address into a different separately allocated object, and
dereference it. IR producers (front-ends) must follow this rule, and consumers
(optimizers, specifically alias analysis) benefit from being able to rely on
it. See the `Rules`_ section for more information.

And, GEP is more concise in common cases.

However, for the underlying integer computation implied, there is no
difference.
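
The claim that the integer computation is identical can be illustrated in C++,
with typed pointer arithmetic playing the role of GEP and ``std::uintptr_t``
playing the role of the ``ptrtoint``/``inttoptr`` pair. This is a sketch with
illustrative function names; it says nothing about the aliasing rules, which
have no C++ counterpart here.

.. code-block:: c++

  #include <cstddef>
  #include <cstdint>

  // GEP-like: typed pointer arithmetic, scaled by the element size implicitly.
  inline std::int32_t *via_pointer(std::int32_t *Base, std::ptrdiff_t Index) {
    return Base + Index;
  }

  // ptrtoint, explicit integer arithmetic, then inttoptr.
  inline std::int32_t *via_integer(std::int32_t *Base, std::ptrdiff_t Index) {
    std::uintptr_t Bits = reinterpret_cast<std::uintptr_t>(Base);
    Bits += static_cast<std::uintptr_t>(Index) * sizeof(std::int32_t);
    return reinterpret_cast<std::int32_t *>(Bits);
  }

Both helpers yield the same address for indices that stay inside one array,
including negative indices, because the unsigned arithmetic wraps in the same
way two's complement addition does.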

I'm writing a backend for a target which needs custom lowering for GEP. How do I do this?
-----------------------------------------------------------------------------------------

You don't. The integer computation implied by a GEP is target-independent.
Typically what you'll need to do is make your backend pattern-match expression
trees involving ADD, MUL, etc., which are what GEP is lowered into. This has the
advantage of letting your code work correctly in more cases.

GEP does use target-dependent parameters for the size and layout of data types,
which targets can customize.

If you require support for addressing units which are not 8 bits, you'll need to
fix a lot of code in the backend, with GEP lowering being only a small piece of
the overall picture.

How does VLA addressing work with GEPs?
---------------------------------------

GEPs don't natively support VLAs. LLVM's type system is entirely static, and GEP
address computations are guided by an LLVM type.

VLA indices can be implemented as linearized indices. For example, an expression
like ``X[a][b][c]`` must be effectively lowered into a form like
``X[a*m+b*n+c]``, so that it appears to the GEP as a single-dimensional array
reference.

This means if you want to write an analysis which understands array indices and
you want to support VLAs, your code will have to be prepared to reverse-engineer
the linearization. One way to solve this problem is to use the ScalarEvolution
library, which always presents VLA and non-VLA indexing in the same manner.
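
The linearization itself is simple arithmetic. A minimal C++ sketch follows,
assuming row-major layout; the function and parameter names are illustrative.
In the notation used above, ``m`` corresponds to ``Dim1 * Dim2`` and ``n`` to
``Dim2``, where ``Dim1`` and ``Dim2`` are the run-time extents of the two inner
dimensions.

.. code-block:: c++

  #include <cstddef>

  // Linear index for X[A][B][C] when the inner extents Dim1 and Dim2 are only
  // known at run time. The outermost extent does not affect the result.
  constexpr std::size_t linearize(std::size_t A, std::size_t B, std::size_t C,
                                  std::size_t Dim1, std::size_t Dim2) {
    return A * (Dim1 * Dim2) + B * Dim2 + C;
  }

  // X[1][2][3] in an array whose inner extents are 4 and 5.
  static_assert(linearize(1, 2, 3, 4, 5) == 33, "1*20 + 2*5 + 3");

The resulting value is what a front-end would pass as the single index of a
one-dimensional GEP.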

.. _Rules:

Rules
=====

What happens if an array index is out of bounds?
------------------------------------------------

There are two senses in which an array index can be out of bounds.

First, there's the array type which comes from the (static) type of the first
operand to the GEP. Indices greater than the number of elements in the
corresponding static array type are valid. There is no problem with out of
bounds indices in this sense. Indexing into an array only depends on the size of
the array element, not the number of elements.

A common example of how this is used is arrays where the size is not known.
It's common to use array types with zero length to represent these. The fact
that the static type says there are zero elements is irrelevant; it's perfectly
valid to compute arbitrary element indices, as the computation only depends on
the size of the array element, not the number of elements. Note that zero-sized
arrays are not a special case here.

This sense is unconnected with the ``inbounds`` keyword. The ``inbounds``
keyword is designed to describe low-level pointer arithmetic overflow
conditions, rather than high-level array indexing rules.

Analysis passes which wish to understand array indexing should not assume that
the static array type bounds are respected.

The second sense of being out of bounds is computing an address that's beyond
the actual underlying allocated object.

With the ``inbounds`` keyword, the result value of the GEP is undefined if the
address is outside the actual underlying allocated object and not the address
one-past-the-end.

Without the ``inbounds`` keyword, there are no restrictions on computing
out-of-bounds addresses. Obviously, performing a load or a store requires an
address of allocated and sufficiently aligned memory. But the GEP itself is only
concerned with computing addresses.
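
The second sense can be modeled as a range check on the resulting byte offset
relative to the start of the allocated object. The C++ sketch below is an
illustration of the rule as stated in this section only; the function name is
made up, and the object size is assumed to be known to the caller.

.. code-block:: c++

  #include <cstdint>

  // True when Offset (bytes from the start of the object) lands inside the
  // object or exactly one past its end, the range the text allows for
  // ``inbounds``.
  constexpr bool within_inbounds_range(std::int64_t Offset,
                                       std::uint64_t ObjectSize) {
    return Offset >= 0 && static_cast<std::uint64_t>(Offset) <= ObjectSize;
  }

  // A 4-byte object, such as the i32 global %MyVar used earlier.
  static_assert(within_inbounds_range(0, 4), "start of the object");
  static_assert(within_inbounds_range(4, 4), "one past the end");
  static_assert(!within_inbounds_range(8, 4), "beyond one past the end");

Note that the static array type never appears in this check; only the actual
size of the underlying object matters.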

Can array indices be negative?
------------------------------

Yes. This is basically a special case of array indices being out of bounds.

Can I compare two values computed with GEPs?
--------------------------------------------

Yes. If both addresses are within the same allocated object, or
one-past-the-end, you'll get the comparison result you expect. If either is
outside of it, integer arithmetic wrapping may occur, so the comparison may not
be meaningful.

Can I do GEP with a different pointer type than the type of the underlying object?
----------------------------------------------------------------------------------

Yes. There are no restrictions on bitcasting a pointer value to an arbitrary
pointer type. The types in a GEP serve only to define the parameters for the
underlying integer computation. They need not correspond with the actual type of
the underlying object.

Furthermore, loads and stores don't have to use the same types as the type of
the underlying object. Types in this context serve only to specify memory size
and alignment. Beyond that they are merely a hint to the optimizer indicating
how the value will likely be used.

Can I cast an object's address to integer and add it to null?
-------------------------------------------------------------

You can compute an address that way, but if you use GEP to do the add, you can't
use that pointer to actually access the object, unless the object is managed
outside of LLVM.

The underlying integer computation is sufficiently defined; null has a defined
value (zero), and you can add whatever value you want to it.

However, it's invalid to access (load from or store to) an LLVM-aware object
with such a pointer. This includes ``GlobalVariables``, ``Allocas``, and objects
pointed to by noalias pointers.

If you really need this functionality, you can do the arithmetic with explicit
integer instructions, and use inttoptr to convert the result to an address. Most
of GEP's special aliasing rules do not apply to pointers computed from ptrtoint,
arithmetic, and inttoptr sequences.

Can I compute the distance between two objects, and add that value to one address to compute the other address?
---------------------------------------------------------------------------------------------------------------

As with arithmetic on null, you can use GEP to compute an address that way, but
you can't use that pointer to actually access the object if you do, unless the
object is managed outside of LLVM.

Also as above, ptrtoint and inttoptr provide an alternative way to do this which
does not have this restriction.

Can I do type-based alias analysis on LLVM IR?
----------------------------------------------

You can't do type-based alias analysis using LLVM's built-in type system,
because LLVM has no restrictions on mixing types in addressing, loads or stores.

LLVM's type-based alias analysis pass uses metadata to describe a different type
system (such as the C type system), and performs type-based aliasing on top of
that. Further details are in the `language reference <LangRef.html#tbaa>`_.

What happens if a GEP computation overflows?
--------------------------------------------

If the GEP lacks the ``inbounds`` keyword, the value is the result from
evaluating the implied two's complement integer computation. However, since
there's no guarantee of where an object will be allocated in the address space,
such values have limited meaning.

If the GEP has the ``inbounds`` keyword, the result value is undefined (a "trap
value") if the GEP overflows (i.e. wraps around the end of the address space).

As such, there are some ramifications of this for inbounds GEPs: scales implied
by array/vector/pointer indices are always known to be "nsw" since they are
signed values that are scaled by the element size. These values are also
allowed to be negative (e.g. "``gep i32 *%P, i32 -1``") but the pointer itself
is logically treated as an unsigned value. This means that GEPs have an
asymmetric relation between the pointer base (which is treated as unsigned) and
the offset applied to it (which is treated as signed). The result of the
additions within the offset calculation cannot have signed overflow, but when
applied to the base pointer, there can be signed overflow.
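
For a GEP without ``inbounds``, the two's complement behavior can be modeled
with unsigned arithmetic, which wraps by definition in C++. The sketch below
assumes a 64-bit address space; the function name and parameters are
illustrative.

.. code-block:: c++

  #include <cstdint>

  // Non-inbounds GEP modeled as wrapping arithmetic: the signed index is
  // scaled by the element size and added to the unsigned base modulo 2^64.
  constexpr std::uint64_t gep_wrapping(std::uint64_t Base, std::int64_t Index,
                                       std::uint64_t ElemSize) {
    return Base + static_cast<std::uint64_t>(Index) * ElemSize;
  }

  // A negative index steps backwards from the base.
  static_assert(gep_wrapping(8, -1, 4) == 4, "8 - 4");
  // Stepping backwards from address 0 wraps to the top of the address space.
  static_assert(gep_wrapping(0, -1, 4) == 0xFFFFFFFFFFFFFFFCull, "2^64 - 4");

The second case is well defined as an integer computation, but an ``inbounds``
GEP producing it would have an undefined result, as described above.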

How can I tell if my front-end is following the rules?
------------------------------------------------------

There is currently no checker for the getelementptr rules. The only way to do
this today is to manually check each place in your front-end where
GetElementPtr operators are created.

It's not possible to write a checker which could find all rule violations
statically. It would be possible to write a checker which works by instrumenting
the code with dynamic checks though. Alternatively, it would be possible to
write a static checker which catches a subset of possible problems. However, no
such checker exists today.

Rationale
=========

Why is GEP designed this way?
-----------------------------

The design of GEP has the following goals, in rough unofficial order of
priority:

* Support C, C-like languages, and languages which can be conceptually lowered
  into C (this covers a lot).

* Support optimizations such as those that are common in C compilers. In
  particular, GEP is a cornerstone of LLVM's `pointer aliasing
  model <LangRef.html#pointeraliasing>`_.

* Provide a consistent method for computing addresses so that address
  computations don't need to be a part of load and store instructions in the IR.

* Support non-C-like languages, to the extent that it doesn't interfere with
  other goals.

* Minimize target-specific information in the IR.

Why do struct member indices always use ``i32``?
------------------------------------------------

The specific type i32 is probably just a historical artifact; however, it's wide
enough for all practical purposes, so there's been no need to change it. It
doesn't necessarily imply i32 address arithmetic; it's just an identifier which
identifies a field in a struct. Requiring that all struct indices be the same
type reduces the range of possibilities for cases where two GEPs are effectively
the same but have distinct operand types.

What's an uglygep?
------------------

Some LLVM optimizers operate on GEPs by internally lowering them into more
primitive integer expressions, which allows them to be combined with other
integer expressions and/or split into multiple separate integer expressions. If
they've made non-trivial changes, translating back into LLVM IR can involve
reverse-engineering the structure of the addressing in order to fit it into the
static type of the original first operand. It isn't always possible to fully
reconstruct this structure; sometimes the underlying addressing doesn't
correspond with the static type at all. In such cases the optimizer instead will
emit a GEP with the base pointer cast to a simple address-unit pointer, using
the name "uglygep". This isn't pretty, but it's just as valid, and it's
sufficient to preserve the pointer aliasing guarantees that GEP provides.
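
The idea of address-unit addressing has a direct C++ analogue: view the base as
a byte pointer and add a precomputed byte offset, ignoring the original static
type. The sketch below is only an illustration of that idea, not of what any
LLVM pass emits; ``Pair`` and ``add_bytes`` are made-up names.

.. code-block:: c++

  #include <cstddef>
  #include <cstdint>

  // Illustrative aggregate; any standard-layout type would do.
  struct Pair {
    std::int32_t First;
    std::int32_t Second;
  };

  // Address-unit addressing: treat Base as a byte pointer and add Bytes to it.
  inline void *add_bytes(void *Base, std::ptrdiff_t Bytes) {
    return static_cast<char *>(Base) + Bytes;
  }

Calling ``add_bytes(&P, offsetof(Pair, Second))`` on a ``Pair`` object ``P``
yields the same address as ``&P.Second``, without the type of ``P`` taking part
in the computation.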

Summary
=======

In summary, here are some things to always remember about the GetElementPtr
instruction:

#. The GEP instruction never accesses memory; it only provides pointer
   computations.

#. The first operand to the GEP instruction is always a pointer and it must be
   indexed.

#. There are no superfluous indices for the GEP instruction.

#. Trailing zero indices are superfluous for pointer aliasing, but not for the
   types of the pointers.

#. Leading zero indices are not superfluous for pointer aliasing nor the types
   of the pointers.