==========================================
Design and Usage of the InAlloca Attribute
==========================================

Introduction
============

.. Warning:: This feature is unstable and not fully implemented.

The :ref:`inalloca <attr_inalloca>` attribute is designed to allow
taking the address of an aggregate argument that is being passed by
value through memory.  Primarily, this feature is required for
compatibility with the Microsoft C++ ABI.  Under that ABI, class
instances that are passed by value are constructed directly into
argument stack memory.  Prior to the addition of inalloca, calls in LLVM
were indivisible instructions.  There was no way to perform intermediate
work, such as object construction, between the first stack adjustment
and the final control transfer.  With inalloca, all arguments passed in
memory are modelled as a single alloca, which can be stored to prior to
the call.  Unfortunately, this complicated feature comes with a large
set of restrictions designed to bound the lifetime of the argument
memory around the call.

For now, it is recommended that frontends and optimizers avoid producing
this construct, primarily because it forces the use of a base pointer.
This feature may grow in the future to allow general mid-level
optimization, but for now, it should be regarded as less efficient than
passing by value with a copy.

Intended Usage
==============

The example below is the intended LLVM IR lowering for some C++ code
that passes a default-constructed ``Foo`` object to ``g`` in the 32-bit
Microsoft C++ ABI.

.. code-block:: c++

  // Foo is non-trivial.
  struct Foo { int a, b; Foo(); ~Foo(); Foo(const Foo &); };
  void g(Foo a, Foo b);
  void f() {
    g(Foo(), Foo());
  }

.. code-block:: llvm

  %struct.Foo = type { i32, i32 }
  %callframe.f = type { %struct.Foo, %struct.Foo }
  declare void @Foo_ctor(%struct.Foo* %this)
  declare void @Foo_dtor(%struct.Foo* %this)
  declare void @g(%callframe.f* inalloca %memargs)

  define void @f() {
  entry:
    %base = call i8* @llvm.stacksave()
    %memargs = alloca %callframe.f
    ; Field 0 holds a and field 1 holds b.  b is constructed first.
    %b = getelementptr %callframe.f* %memargs, i32 0, i32 1
    %a = getelementptr %callframe.f* %memargs, i32 0, i32 0
    call void @Foo_ctor(%struct.Foo* %b)

    ; If a's ctor throws, we must destruct b.
    invoke void @Foo_ctor(%struct.Foo* %a)
        to label %invoke.cont unwind label %invoke.unwind

  invoke.cont:
    call void @g(%callframe.f* inalloca %memargs)
    call void @llvm.stackrestore(i8* %base)
    ...

  invoke.unwind:
    call void @Foo_dtor(%struct.Foo* %b)
    call void @llvm.stackrestore(i8* %base)
    ...
  }

To avoid stack leaks, the frontend saves the current stack pointer with
a call to :ref:`llvm.stacksave <int_stacksave>`.  Then, it allocates the
argument stack space with alloca and calls the default constructor.  The
default constructor could throw an exception, so the frontend has to
create a landing pad.  The frontend has to destroy the already
constructed argument ``b`` before restoring the stack pointer.  If the
constructor does not unwind, ``g`` is called.  In the Microsoft C++ ABI,
``g`` will destroy its arguments, and then the stack is restored in
``f``.

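The ordering described above can be imitated in plain C++ with placement
new.  This is only a sketch of the sequence of events, not of what a
frontend emits: ``CallFrame``, ``callee``, ``caller``, the ``events`` log,
and the ``fail`` constructor parameter are all hypothetical names introduced
for illustration.

```cpp
#include <cassert>
#include <new>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical event log, used only to make the ordering observable.
static std::vector<std::string> events;

struct Foo {
  int a = 0, b = 0;
  // The fail flag is an assumption that lets a constructor throw on demand.
  explicit Foo(bool fail) {
    if (fail) {
      events.push_back("ctor throws");
      throw std::runtime_error("Foo ctor failed");
    }
    events.push_back("ctor");
  }
  ~Foo() { events.push_back("dtor"); }
};

// Stand-in for the single argument allocation (%callframe.f in the IR).
struct CallFrame {
  alignas(Foo) unsigned char a[sizeof(Foo)];
  alignas(Foo) unsigned char b[sizeof(Foo)];
};

// Stand-in for g: the callee destroys its own arguments.
void callee(Foo *a, Foo *b) {
  events.push_back("call");
  a->~Foo();
  b->~Foo();
}

// Returns true if the callee was reached.
bool caller(bool a_ctor_fails) {
  CallFrame frame;                         // argument memory, like the alloca
  Foo *b = new (frame.b) Foo(false);       // b is constructed first
  Foo *a = nullptr;
  try {
    a = new (frame.a) Foo(a_ctor_fails);   // a's constructor may throw
  } catch (const std::runtime_error &) {
    b->~Foo();                             // landing pad: destroy b only
    return false;
  }
  callee(a, b);
  return true;
}
```

Calling ``caller(true)`` records ``ctor``, ``ctor throws``, ``dtor``: only
the already constructed ``b`` is destroyed and the callee is never reached.
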
Design Considerations
=====================

Lifetime
--------

The biggest design consideration for this feature is object lifetime.
We cannot model the arguments as static allocas in the entry block,
because all calls need to use the memory at the top of the stack to pass
arguments.  We cannot vend pointers to that memory at function entry
because after code generation they will alias.

The rule against allocas between argument allocations and the call site
avoids this problem, but it creates a cleanup problem.  Cleanup and
lifetime are handled explicitly with stack save and restore calls.  In
the future, we may want to introduce a new construct such as ``freea``
or ``afree`` to make it clear that this stack adjusting cleanup is less
powerful than a full stack save and restore.

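A toy model may help show why explicit save and restore is needed.  The
sketch below tracks nothing but a stack pointer; ``ModelStack`` and
``call_with_args`` are hypothetical names, and the model ignores alignment
and the actual contents of memory.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model of a downward-growing stack; only the pointer is tracked.
class ModelStack {
public:
  explicit ModelStack(std::size_t size) : sp_(size) {}
  std::size_t save() const { return sp_; }          // plays llvm.stacksave
  void restore(std::size_t saved) { sp_ = saved; }  // plays llvm.stackrestore
  std::size_t alloca_bytes(std::size_t n) {         // plays a dynamic alloca
    assert(n <= sp_ && "model stack overflow");
    sp_ -= n;
    return sp_;
  }
  std::size_t sp() const { return sp_; }

private:
  std::size_t sp_;
};

// One call site: save, allocate argument memory at the top of the stack,
// (construct arguments and call), then restore.  Returns the argument address.
std::size_t call_with_args(ModelStack &stack, std::size_t arg_bytes) {
  std::size_t base = stack.save();
  std::size_t args = stack.alloca_bytes(arg_bytes);
  stack.restore(base);
  return args;
}
```

In this model, two consecutive calls receive the same argument address,
because each one allocates at the top of the stack and restores afterwards.
Dropping the ``restore`` would leave the stack pointer lower after every
call, which is the stack leak that the save and restore calls prevent.
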
Nested Calls and Copy Elision
-----------------------------

We also want to be able to support copy elision into these argument
slots.  This means we have to support multiple live argument
allocations.

Consider the evaluation of:

.. code-block:: c++

  // Foo is non-trivial.
  struct Foo { int a; Foo(); Foo(const Foo &); ~Foo(); };
  Foo bar(Foo b);
  int main() {
    bar(bar(Foo()));
  }

In this case, we want to be able to elide copies into ``bar``'s argument
slots.  That means we need to have more than one set of argument frames
active at the same time.  First, we need to allocate the frame for the
outer call so we can pass it in as the hidden struct return pointer to
the middle call.  Then we do the same for the middle call, allocating a
frame and passing its address to ``Foo``'s default constructor.  By
wrapping the evaluation of the inner ``bar`` with stack save and
restore, we can have multiple overlapping active call frames.

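The effect of elision on this expression can be observed by instrumenting
``Foo``.  The sketch below is an illustration rather than part of the
design: the ``Counts`` struct, the ``counts`` global, the body of ``bar``,
and ``run_nested_call`` are assumptions added to make the example
measurable.

```cpp
#include <cassert>

// Hypothetical counters, added only to observe construction and destruction.
struct Counts {
  int default_ctors = 0;
  int copies = 0;
  int dtors = 0;
};
static Counts counts;

struct Foo {
  int a = 0;
  Foo() { ++counts.default_ctors; }
  Foo(const Foo &other) : a(other.a) { ++counts.copies; }
  ~Foo() { ++counts.dtors; }
};

// Assumed body: returning a parameter cannot be elided, so it copies.
Foo bar(Foo b) { return b; }

// Evaluates the nested call once and reports what happened.
Counts run_nested_call() {
  counts = Counts();
  { bar(bar(Foo())); }
  return counts;
}
```

Under C++17 rules, the temporary ``Foo()`` is constructed directly in the
inner call's parameter and the inner call's result directly in the outer
call's parameter, so the only copies are the two ``return b`` statements:
one default construction, two copies, and three destructions in total.
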
Callee-cleanup Calling Conventions
----------------------------------

Another wrinkle is the existence of callee-cleanup conventions.  On
Windows, all methods and many other functions adjust the stack to clear
the memory used to pass their arguments.  In some sense, this means that
the allocas are automatically cleared by the call.  However, LLVM
models this as a write of undef to all of the inalloca values passed to
the call rather than as a stack adjustment.  Frontends should
still restore the stack pointer to avoid a stack leak.

Exceptions
----------

There is also the possibility of an exception.  If argument evaluation
or copy construction throws an exception, the landing pad must do
cleanup, which includes adjusting the stack pointer to avoid a stack
leak.  This means the cleanup of the stack memory cannot be tied to the
call itself.  There needs to be a separate IR-level instruction that can
perform independent cleanup of arguments.

Efficiency
----------

Eventually, it should be possible to generate efficient code for this
construct.  In particular, using inalloca should not require a base
pointer.  If the backend can prove that all points in the CFG only have
one possible stack level, then it can address the stack directly from
the stack pointer.  While this is not yet implemented, the plan is that
the inalloca attribute should not change much, but the frontend IR
generation recommendations may change.
164generation recommendations may change.