.. Blame: Reid Kleckner, a534a38, 2013-12-19 02:14:12 +0000

==========================================
Design and Usage of the InAlloca Attribute
==========================================

Introduction
============

.. Warning:: This feature is unstable and not fully implemented.

The :ref:`attr_inalloca` attribute is designed to allow taking the
address of an aggregate argument that is being passed by value through
memory. Primarily, this feature is required for compatibility with the
Microsoft C++ ABI. Under that ABI, class instances that are passed by
value are constructed directly into argument stack memory. Prior to the
addition of inalloca, calls in LLVM were indivisible instructions:
there was no way to perform intermediate work, such as object
construction, between the first stack adjustment and the final control
transfer. With inalloca, each argument is modelled as an alloca, which
can be stored to independently of the call. Unfortunately, this
complicated feature comes with a large set of restrictions designed to
bound the lifetime of the argument memory around the call. This
document explains those restrictions.

For now, it is recommended that frontends and optimizers avoid producing
this construct, primarily because it forces the use of a base pointer.
This feature may grow in the future to allow general mid-level
optimization, but for now, it should be regarded as less efficient than
passing by value with a copy.

Intended Usage
==============

In the example below, ``f`` passes a default-constructed ``Foo`` object
to ``g`` by value.

.. code-block:: llvm

    %Foo = type { i32, i32 }
    declare void @Foo_ctor(%Foo* %this)
    declare void @g(%Foo* inalloca %arg)

    define void @f() {
      ...

    bb1:
      %base = call i8* @llvm.stacksave()
      %arg = alloca %Foo
      invoke void @Foo_ctor(%Foo* %arg)
          to label %invoke.cont unwind label %invoke.unwind

    invoke.cont:
      call void @g(%Foo* inalloca %arg)
      call void @llvm.stackrestore(i8* %base)
      ...

    invoke.unwind:
      call void @llvm.stackrestore(i8* %base)
      ...
    }

The alloca in this example is dynamic, meaning it is not in the entry
block, and it can be executed more than once. Because no other alloca
may appear between an alloca used with inalloca and its associated call
site, all allocas used with inalloca are considered dynamic.

To avoid any stack leakage, the frontend saves the current stack pointer
with a call to :ref:`llvm.stacksave <int_stacksave>`. Then, it
allocates the argument stack space with alloca and calls the default
constructor. One important consideration is that the default
constructor could throw an exception, so the frontend has to create a
landing pad. In that landing pad, if there were any other inalloca
arguments, the frontend would have to destruct them before restoring the
stack pointer. If the constructor does not unwind, ``g`` is called, and
then the stack is restored.

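
The landing-pad obligation described above can be sketched for a call
with two inalloca arguments. This is only an illustration written in the
same IR dialect as the example above: ``@g2``, ``@Foo_dtor``, the block
names, and the use of ``@__gxx_personality_v0`` as a placeholder
personality routine are all assumptions, not names taken from LLVM or
from this document. It also assumes that two allocas feeding the same
call may sit next to each other.

.. code-block:: llvm

    %Foo = type { i32, i32 }
    declare void @Foo_ctor(%Foo* %this)
    declare void @Foo_dtor(%Foo* %this)                     ; hypothetical destructor
    declare void @g2(%Foo* inalloca %a, %Foo* inalloca %b)  ; hypothetical callee
    declare i32 @__gxx_personality_v0(...)                  ; placeholder personality
    declare i8* @llvm.stacksave()
    declare void @llvm.stackrestore(i8*)

    define void @f2() {
    entry:
      br label %bb1

    bb1:
      %base = call i8* @llvm.stacksave()
      %a = alloca %Foo
      %b = alloca %Foo
      invoke void @Foo_ctor(%Foo* %a)
          to label %ctor.a.cont unwind label %unwind.none

    ctor.a.cont:
      invoke void @Foo_ctor(%Foo* %b)
          to label %ctor.b.cont unwind label %unwind.a

    ctor.b.cont:
      call void @g2(%Foo* inalloca %a, %Foo* inalloca %b)
      call void @llvm.stackrestore(i8* %base)
      ret void

    unwind.a:
      ; %b failed to construct, but %a is live: destruct it, then
      ; restore the stack pointer.
      %lp.a = landingpad { i8*, i32 }
          personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*)
          cleanup
      call void @Foo_dtor(%Foo* %a)
      call void @llvm.stackrestore(i8* %base)
      resume { i8*, i32 } %lp.a

    unwind.none:
      ; Nothing was constructed: only the stack pointer needs restoring.
      %lp = landingpad { i8*, i32 }
          personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*)
          cleanup
      call void @llvm.stackrestore(i8* %base)
      resume { i8*, i32 } %lp
    }

Each constructor gets its own unwind destination so that the cleanup
code destructs exactly the arguments that were fully constructed before
the exception was thrown.
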
Design Considerations
=====================

Lifetime
--------

The biggest design consideration for this feature is object lifetime.
We cannot model the arguments as static allocas in the entry block,
because all calls need to use the memory at the end of the call frame to
pass arguments. We cannot vend pointers to that memory at function entry
because after code generation they would alias one another. In the
current design, the rule against allocas between the inalloca alloca
values and the call site avoids this problem, but it creates a cleanup
problem. Cleanup and lifetime are handled explicitly with stack save and
restore calls. In the future, we may be able to avoid this by using
:ref:`llvm.lifetime.start <int_lifestart>` and :ref:`llvm.lifetime.end
<int_lifeend>` instead.

Nested Calls and Copy Elision
-----------------------------

The next consideration is the ability for the frontend to perform copy
elision in the face of nested calls. Consider the evaluation of
``foo(foo(Bar()))``, where ``foo`` takes and returns a ``Bar`` object by
value and ``Bar`` has non-trivial constructors. In this case, we want
to be able to elide copies into ``foo``'s argument slots. That means we
need to have more than one argument frame active at the same time.
First, we need to allocate the frame for the outer call so we can pass
it in as the hidden struct return pointer to the middle call (the inner
``foo``). Then we do the same for the middle call, allocating a frame
and passing its address to ``Bar``'s default constructor. By wrapping
the evaluation of the inner ``foo`` with stack save and restore, we can
have multiple overlapping active call frames.

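
The frame ordering described above might look roughly like the following
IR. It is a sketch under several assumptions: the ``%Bar`` layout, the
``@Bar_ctor`` and ``@caller`` names, and the ``sret``-plus-``inalloca``
signature chosen for ``@foo`` are hypothetical, and exception handling
and destruction of the final result are left out to keep the nesting
visible.

.. code-block:: llvm

    %Bar = type { i32 }                        ; hypothetical layout
    declare void @Bar_ctor(%Bar* %this)        ; hypothetical default constructor
    declare void @foo(%Bar* sret %result, %Bar* inalloca %arg)
    declare i8* @llvm.stacksave()
    declare void @llvm.stackrestore(i8*)

    define void @caller() {
    entry:
      %result = alloca %Bar                    ; static slot for the final value
      br label %eval

    eval:
      ; Frame for the outer call, allocated first.
      %outer.base = call i8* @llvm.stacksave()
      %outer.arg = alloca %Bar

      ; Frame for the middle call, wrapped in its own save/restore pair.
      %inner.base = call i8* @llvm.stacksave()
      %inner.arg = alloca %Bar
      call void @Bar_ctor(%Bar* %inner.arg)
      ; The middle call returns directly into the outer call's argument slot.
      call void @foo(%Bar* sret %outer.arg, %Bar* inalloca %inner.arg)
      call void @llvm.stackrestore(i8* %inner.base)

      ; The outer call consumes the slot that the middle call filled in.
      call void @foo(%Bar* sret %result, %Bar* inalloca %outer.arg)
      call void @llvm.stackrestore(i8* %outer.base)
      ret void
    }

Restoring to ``%inner.base`` pops only the middle call's frame, because
that stack pointer value was saved after ``%outer.arg`` had already been
allocated.
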
Callee-cleanup Calling Conventions
----------------------------------

Another wrinkle is the existence of callee-cleanup conventions. On
Windows, all methods and many other functions adjust the stack to clear
the memory used to pass their arguments. In some sense, this means that
the allocas are automatically cleared by the call. However, LLVM models
this as a write of undef to all of the inalloca values passed to the
call rather than as a stack adjustment. Frontends should still restore
the stack pointer to avoid a stack leak.

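
As a small illustration of that rule, the sketch below calls a
callee-cleanup function and still restores the stack pointer afterwards.
The functions ``@h`` and ``@k`` are hypothetical, ``x86_stdcallcc`` is
used only as one example of a callee-cleanup convention, and exception
handling is omitted.

.. code-block:: llvm

    %Foo = type { i32, i32 }
    declare void @Foo_ctor(%Foo* %this)
    declare x86_stdcallcc void @h(%Foo* inalloca %arg)   ; hypothetical callee
    declare i8* @llvm.stacksave()
    declare void @llvm.stackrestore(i8*)

    define void @k() {
    entry:
      br label %bb1

    bb1:
      %base = call i8* @llvm.stacksave()
      %arg = alloca %Foo
      call void @Foo_ctor(%Foo* %arg)
      call x86_stdcallcc void @h(%Foo* inalloca %arg)
      ; After the call, the contents of %arg are modelled as undef and
      ; must not be read, but the frontend still restores the stack.
      call void @llvm.stackrestore(i8* %base)
      ret void
    }
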
Exceptions
----------

There is also the possibility of an exception. If argument evaluation
or copy construction throws an exception, the landing pad must do
cleanup, which includes adjusting the stack pointer to avoid a stack
leak. This means the cleanup of the stack memory cannot be tied to the
call itself. There needs to be a separate IR-level instruction that can
perform independent cleanup of arguments.

Efficiency
----------

Eventually, it should be possible to generate efficient code for this
construct. In particular, using inalloca should not require a base
pointer. If the backend can prove that all points in the CFG only have
one possible stack level, then it can address the stack directly from
the stack pointer. While this is not yet implemented, the plan is that
the inalloca attribute should not change much, but the frontend IR
generation recommendations may change.