blob: f80c0673fdd1fc89e6b9a640534c155557fb2e1a [file] [log] [blame]
Chris Lattner00c992d2007-11-03 08:55:29 +00001<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
2 "http://www.w3.org/TR/html4/strict.dtd">
3
4<html>
5<head>
6 <title>Kaleidoscope: Extending the Language: Mutable Variables / SSA
7 construction</title>
8 <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
9 <meta name="author" content="Chris Lattner">
10 <link rel="stylesheet" href="../llvm.css" type="text/css">
11</head>
12
13<body>
14
15<div class="doc_title">Kaleidoscope: Extending the Language: Mutable Variables</div>
16
17<div class="doc_author">
18 <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a></p>
19</div>
20
21<!-- *********************************************************************** -->
22<div class="doc_section"><a name="intro">Part 7 Introduction</a></div>
23<!-- *********************************************************************** -->
24
25<div class="doc_text">
26
27<p>Welcome to Part 7 of the "<a href="index.html">Implementing a language with
28LLVM</a>" tutorial. In parts 1 through 6, we've built a very respectable,
29albeit simple, <a
30href="http://en.wikipedia.org/wiki/Functional_programming">functional
31programming language</a>. In our journey, we learned some parsing techniques,
32how to build and represent an AST, how to build LLVM IR, and how to optimize
33the resultant code and JIT compile it.</p>
34
35<p>While Kaleidoscope is interesting as a functional language, this makes it
36"too easy" to generate LLVM IR for it. In particular, a functional language
37makes it very easy to build LLVM IR directly in <a
38href="http://en.wikipedia.org/wiki/Static_single_assignment_form">SSA form</a>.
39Since LLVM requires that the input code be in SSA form, this is a very nice
40property and it is often unclear to newcomers how to generate code for an
41imperative language with mutable variables.</p>
42
43<p>The short (and happy) summary of this chapter is that there is no need for
44your front-end to build SSA form: LLVM provides highly tuned and well tested
45support for this, though the way it works is a bit unexpected for some.</p>
46
47</div>
48
49<!-- *********************************************************************** -->
50<div class="doc_section"><a name="why">Why is this a hard problem?</a></div>
51<!-- *********************************************************************** -->
52
53<div class="doc_text">
54
55<p>
56To understand why mutable variables cause complexities in SSA construction,
57consider this extremely simple C example:
58</p>
59
60<div class="doc_code">
61<pre>
62int G, H;
63int test(_Bool Condition) {
64 int X;
65 if (Condition)
66 X = G;
67 else
68 X = H;
69 return X;
70}
71</pre>
72</div>
73
74<p>In this case, we have the variable "X", whose value depends on the path
75executed in the program. Because there are two different possible values for X
76before the return instruction, a PHI node is inserted to merge the two values.
77The LLVM IR that we want for this example looks like this:</p>
78
79<div class="doc_code">
80<pre>
81@G = weak global i32 0 ; type of @G is i32*
82@H = weak global i32 0 ; type of @H is i32*
83
84define i32 @test(i1 %Condition) {
85entry:
86 br i1 %Condition, label %cond_true, label %cond_false
87
88cond_true:
89 %X.0 = load i32* @G
90 br label %cond_next
91
92cond_false:
93 %X.1 = load i32* @H
94 br label %cond_next
95
96cond_next:
97 %X.2 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
98 ret i32 %X.2
99}
100</pre>
101</div>
102
103<p>In this example, the loads from the G and H global variables are explicit in
104the LLVM IR, and they live in the then/else branches of the if statement
105(cond_true/cond_false). In order to merge the incoming values, the X.2 phi node
106in the cond_next block selects the right value to use based on where control
107flow is coming from: if control flow comes from the cond_false block, X.2 gets
108the value of X.1. Alternatively, if control flow comes from cond_tree, it gets
109the value of X.0. The intent of this chapter is not to explain the details of
110SSA form. For more information, see one of the many <a
111href="http://en.wikipedia.org/wiki/Static_single_assignment_form">online
112references</a>.</p>
113
114<p>The question for this article is "who places phi nodes when lowering
115assignments to mutable variables?". The issue here is that LLVM
116<em>requires</em> that its IR be in SSA form: there is no "non-ssa" mode for it.
117However, SSA construction requires non-trivial algorithms and data structures,
118so it is inconvenient and wasteful for every front-end to have to reproduce this
119logic.</p>
120
121</div>
122
123<!-- *********************************************************************** -->
124<div class="doc_section"><a name="memory">Memory in LLVM</a></div>
125<!-- *********************************************************************** -->
126
127<div class="doc_text">
128
129<p>The 'trick' here is that while LLVM does require all register values to be
130in SSA form, it does not require (or permit) memory objects to be in SSA form.
131In the example above, note that the loads from G and H are direct accesses to
132G and H: they are not renamed or versioned. This differs from some other
133compiler systems, which does try to version memory objects. In LLVM, instead of
134encoding dataflow analysis of memory into the LLVM IR, it is handled with <a
135href="../WritingAnLLVMPass.html">Analysis Passes</a> which are computed on
136demand.</p>
137
138<p>
139With this in mind, the high-level idea is that we want to make a stack variable
140(which lives in memory, because it is on the stack) for each mutable object in
141a function. To take advantage of this trick, we need to talk about how LLVM
142represents stack variables.
143</p>
144
145<p>In LLVM, all memory accesses are explicit with load/store instructions, and
146it is carefully designed to not have (or need) an "address-of" operator. Notice
147how the type of the @G/@H global variables is actually "i32*" even though the
148variable is defined as "i32". What this means is that @G defines <em>space</em>
149for an i32 in the global data area, but its <em>name</em> actually refers to the
150address for that space. Stack variables work the same way, but instead of being
151declared with global variable definitions, they are declared with the
152<a href="../LangRef.html#i_alloca">LLVM alloca instruction</a>:</p>
153
154<div class="doc_code">
155<pre>
156define i32 @test(i1 %Condition) {
157entry:
158 %X = alloca i32 ; type of %X is i32*.
159 ...
160 %tmp = load i32* %X ; load the stack value %X from the stack.
161 %tmp2 = add i32 %tmp, 1 ; increment it
162 store i32 %tmp2, i32* %X ; store it back
163 ...
164</pre>
165</div>
166
167<p>This code shows an example of how you can declare and manipulate a stack
168variable in the LLVM IR. Stack memory allocated with the alloca instruction is
169fully general: you can pass the address of the stack slot to functions, you can
170store it in other variables, etc. In our example above, we could rewrite the
171example to use the alloca technique to avoid using a PHI node:</p>
172
173<div class="doc_code">
174<pre>
175@G = weak global i32 0 ; type of @G is i32*
176@H = weak global i32 0 ; type of @H is i32*
177
178define i32 @test(i1 %Condition) {
179entry:
180 %X = alloca i32 ; type of %X is i32*.
181 br i1 %Condition, label %cond_true, label %cond_false
182
183cond_true:
184 %X.0 = load i32* @G
185 store i32 %X.0, i32* %X ; Update X
186 br label %cond_next
187
188cond_false:
189 %X.1 = load i32* @H
190 store i32 %X.1, i32* %X ; Update X
191 br label %cond_next
192
193cond_next:
194 %X.2 = load i32* %X ; Read X
195 ret i32 %X.2
196}
197</pre>
198</div>
199
200<p>With this, we have discovered a way to handle arbitrary mutable variables
201without the need to create Phi nodes at all:</p>
202
203<ol>
204<li>Each mutable variable becomes a stack allocation.</li>
205<li>Each read of the variable becomes a load from the stack.</li>
206<li>Each update of the variable becomes a store to the stack.</li>
207<li>Taking the address of a variable just uses the stack address directly.</li>
208</ol>
209
210<p>While this solution has solved our immediate problem, it introduced another
211one: we have now apparently introduced a lot of stack traffic for very simple
212and common operations, a major performance problem. Fortunately for us, the
213LLVM optimizer has a highly-tuned optimization pass named "mem2reg" that handles
214this case, promoting allocas like this into SSA registers, inserting Phi nodes
215as appropriate. If you run this example through the pass, for example, you'll
216get:</p>
217
218<div class="doc_code">
219<pre>
220$ <b>llvm-as &lt; example.ll | opt -mem2reg | llvm-dis</b>
221@G = weak global i32 0
222@H = weak global i32 0
223
224define i32 @test(i1 %Condition) {
225entry:
226 br i1 %Condition, label %cond_true, label %cond_false
227
228cond_true:
229 %X.0 = load i32* @G
230 br label %cond_next
231
232cond_false:
233 %X.1 = load i32* @H
234 br label %cond_next
235
236cond_next:
237 %X.01 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
238 ret i32 %X.01
239}
240</pre>
241
242<p>The mem2reg pass is guaranteed to work, and
243
244which cases.
245</p>
246
247<p>The final question you may be asking is: should I bother with this nonsense
248for my front-end? Wouldn't it be better if I just did SSA construction
249directly, avoiding use of the mem2reg optimization pass?
250
251Proven, well tested, debug info, etc.
252</p>
253</div>
254
255
256<!-- *********************************************************************** -->
257<div class="doc_section"><a name="code">Full Code Listing</a></div>
258<!-- *********************************************************************** -->
259
260<div class="doc_text">
261
262<p>
263Here is the complete code listing for our running example, enhanced with the
264if/then/else and for expressions.. To build this example, use:
265</p>
266
267<div class="doc_code">
268<pre>
269 # Compile
270 g++ -g toy.cpp `llvm-config --cppflags --ldflags --libs core jit native` -O3 -o toy
271 # Run
272 ./toy
273</pre>
274</div>
275
276<p>Here is the code:</p>
277
278<div class="doc_code">
279<pre>
280</pre>
281</div>
282
283</div>
284
285<!-- *********************************************************************** -->
286<hr>
287<address>
288 <a href="http://jigsaw.w3.org/css-validator/check/referer"><img
289 src="http://jigsaw.w3.org/css-validator/images/vcss" alt="Valid CSS!"></a>
290 <a href="http://validator.w3.org/check/referer"><img
291 src="http://www.w3.org/Icons/valid-html401" alt="Valid HTML 4.01!"></a>
292
293 <a href="mailto:sabre@nondot.org">Chris Lattner</a><br>
294 <a href="http://llvm.org">The LLVM Compiler Infrastructure</a><br>
295 Last modified: $Date: 2007-10-17 11:05:13 -0700 (Wed, 17 Oct 2007) $
296</address>
297</body>
298</html>