<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
                      "http://www.w3.org/TR/html4/strict.dtd">
3
4<html>
5<head>
6 <title>Kaleidoscope: Extending the Language: Mutable Variables / SSA
7 construction</title>
8 <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
9 <meta name="author" content="Chris Lattner">
10 <link rel="stylesheet" href="../llvm.css" type="text/css">
11</head>
12
13<body>
14
15<div class="doc_title">Kaleidoscope: Extending the Language: Mutable Variables</div>
16
<ul>
18<li>Chapter 7
19 <ol>
20 <li><a href="#intro">Chapter 7 Introduction</a></li>
21 <li><a href="#why">Why is this a hard problem?</a></li>
22 <li><a href="#memory">Memory in LLVM</a></li>
23 <li><a href="#kalvars">Mutable Variables in Kaleidoscope</a></li>
24 <li><a href="#adjustments">Adjusting Existing Variables for
25 Mutation</a></li>
26 <li><a href="#assignment">New Assignment Operator</a></li>
27 <li><a href="#localvars">User-defined Local Variables</a></li>
28 <li><a href="#code">Full Code Listing</a></li>
29 </ol>
30</li>
31</ul>
32
<div class="doc_author">
34 <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a></p>
35</div>
36
37<!-- *********************************************************************** -->
<div class="doc_section"><a name="intro">Chapter 7 Introduction</a></div>
<!-- *********************************************************************** -->
40
41<div class="doc_text">
42
<p>Welcome to Chapter 7 of the "<a href="index.html">Implementing a language
with LLVM</a>" tutorial. In chapters 1 through 6, we've built a very
respectable, albeit simple, <a
href="http://en.wikipedia.org/wiki/Functional_programming">functional
programming language</a>. In our journey, we learned some parsing techniques,
how to build and represent an AST, how to build LLVM IR, and how to optimize
the resultant code and JIT compile it.</p>
50
<p>While Kaleidoscope is interesting as a functional language, the fact that it
is functional makes it "too easy" to generate LLVM IR for it. In particular, a
functional language makes it very easy to build LLVM IR directly in <a
href="http://en.wikipedia.org/wiki/Static_single_assignment_form">SSA form</a>.
Since LLVM requires that the input code be in SSA form, this is a very nice
property, but it leaves newcomers unsure how to generate code for an
imperative language with mutable variables.</p>
58
59<p>The short (and happy) summary of this chapter is that there is no need for
60your front-end to build SSA form: LLVM provides highly tuned and well tested
61support for this, though the way it works is a bit unexpected for some.</p>
62
63</div>
64
65<!-- *********************************************************************** -->
66<div class="doc_section"><a name="why">Why is this a hard problem?</a></div>
67<!-- *********************************************************************** -->
68
69<div class="doc_text">
70
71<p>
72To understand why mutable variables cause complexities in SSA construction,
73consider this extremely simple C example:
74</p>
75
76<div class="doc_code">
77<pre>
78int G, H;
79int test(_Bool Condition) {
80 int X;
81 if (Condition)
82 X = G;
83 else
84 X = H;
85 return X;
86}
87</pre>
88</div>
89
90<p>In this case, we have the variable "X", whose value depends on the path
91executed in the program. Because there are two different possible values for X
92before the return instruction, a PHI node is inserted to merge the two values.
93The LLVM IR that we want for this example looks like this:</p>
94
95<div class="doc_code">
96<pre>
97@G = weak global i32 0 ; type of @G is i32*
98@H = weak global i32 0 ; type of @H is i32*
99
100define i32 @test(i1 %Condition) {
101entry:
102 br i1 %Condition, label %cond_true, label %cond_false
103
104cond_true:
105 %X.0 = load i32* @G
106 br label %cond_next
107
108cond_false:
109 %X.1 = load i32* @H
110 br label %cond_next
111
112cond_next:
113 %X.2 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
114 ret i32 %X.2
115}
116</pre>
117</div>
118
119<p>In this example, the loads from the G and H global variables are explicit in
120the LLVM IR, and they live in the then/else branches of the if statement
121(cond_true/cond_false). In order to merge the incoming values, the X.2 phi node
122in the cond_next block selects the right value to use based on where control
123flow is coming from: if control flow comes from the cond_false block, X.2 gets
the value of X.1. Alternatively, if control flow comes from cond_true, it gets
125the value of X.0. The intent of this chapter is not to explain the details of
126SSA form. For more information, see one of the many <a
127href="http://en.wikipedia.org/wiki/Static_single_assignment_form">online
128references</a>.</p>
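<p>To make the selection rule concrete, here is a minimal C++ sketch of how a
phi node picks its value. It is a toy model, not LLVM code: the
<tt>Incoming</tt> type and the <tt>evalPhi</tt> function are hypothetical names
introduced only for this illustration, and the integer values stand in for
whatever was loaded from G and H.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// One phi operand: the incoming value and the label of the predecessor block
// it arrives from. (Hypothetical type, for illustration only.)
typedef std::pair<int, std::string> Incoming;

// Toy model of phi evaluation: pick the value paired with the block that
// control flow just left. Returns false if CameFrom is not a listed
// predecessor, which well-formed IR never allows.
bool evalPhi(const std::vector<Incoming> &Operands,
             const std::string &CameFrom, int &Result) {
  for (std::size_t i = 0; i != Operands.size(); ++i) {
    if (Operands[i].second == CameFrom) {
      Result = Operands[i].first;
      return true;
    }
  }
  return false;
}
```

<p>With operands (20, "cond_false") and (10, "cond_true"), arriving from
cond_true yields 10 and arriving from cond_false yields 20, mirroring how X.2
takes the value of X.0 or X.1.</p>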
129
130<p>The question for this article is "who places phi nodes when lowering
131assignments to mutable variables?". The issue here is that LLVM
132<em>requires</em> that its IR be in SSA form: there is no "non-ssa" mode for it.
133However, SSA construction requires non-trivial algorithms and data structures,
134so it is inconvenient and wasteful for every front-end to have to reproduce this
135logic.</p>
136
137</div>
138
139<!-- *********************************************************************** -->
140<div class="doc_section"><a name="memory">Memory in LLVM</a></div>
141<!-- *********************************************************************** -->
142
143<div class="doc_text">
144
<p>The 'trick' here is that while LLVM does require all register values to be
in SSA form, it does not require (or permit) memory objects to be in SSA form.
In the example above, note that the loads from G and H are direct accesses to
G and H: they are not renamed or versioned. This differs from some other
compiler systems, which do try to version memory objects. Instead of encoding
dataflow analysis of memory into the IR, LLVM handles it with <a
href="../WritingAnLLVMPass.html">Analysis Passes</a>, which are computed on
demand.</p>
153
154<p>
155With this in mind, the high-level idea is that we want to make a stack variable
156(which lives in memory, because it is on the stack) for each mutable object in
157a function. To take advantage of this trick, we need to talk about how LLVM
158represents stack variables.
159</p>
160
161<p>In LLVM, all memory accesses are explicit with load/store instructions, and
162it is carefully designed to not have (or need) an "address-of" operator. Notice
163how the type of the @G/@H global variables is actually "i32*" even though the
164variable is defined as "i32". What this means is that @G defines <em>space</em>
165for an i32 in the global data area, but its <em>name</em> actually refers to the
166address for that space. Stack variables work the same way, but instead of being
167declared with global variable definitions, they are declared with the
168<a href="../LangRef.html#i_alloca">LLVM alloca instruction</a>:</p>
169
170<div class="doc_code">
171<pre>
172define i32 @test(i1 %Condition) {
173entry:
174 %X = alloca i32 ; type of %X is i32*.
175 ...
176 %tmp = load i32* %X ; load the stack value %X from the stack.
177 %tmp2 = add i32 %tmp, 1 ; increment it
178 store i32 %tmp2, i32* %X ; store it back
179 ...
180</pre>
181</div>
182
183<p>This code shows an example of how you can declare and manipulate a stack
184variable in the LLVM IR. Stack memory allocated with the alloca instruction is
185fully general: you can pass the address of the stack slot to functions, you can
186store it in other variables, etc. In our example above, we could rewrite the
187example to use the alloca technique to avoid using a PHI node:</p>
188
189<div class="doc_code">
190<pre>
191@G = weak global i32 0 ; type of @G is i32*
192@H = weak global i32 0 ; type of @H is i32*
193
194define i32 @test(i1 %Condition) {
195entry:
196 %X = alloca i32 ; type of %X is i32*.
197 br i1 %Condition, label %cond_true, label %cond_false
198
199cond_true:
200 %X.0 = load i32* @G
201 store i32 %X.0, i32* %X ; Update X
202 br label %cond_next
203
204cond_false:
205 %X.1 = load i32* @H
206 store i32 %X.1, i32* %X ; Update X
207 br label %cond_next
208
209cond_next:
210 %X.2 = load i32* %X ; Read X
211 ret i32 %X.2
212}
213</pre>
214</div>
215
216<p>With this, we have discovered a way to handle arbitrary mutable variables
217without the need to create Phi nodes at all:</p>
218
219<ol>
220<li>Each mutable variable becomes a stack allocation.</li>
221<li>Each read of the variable becomes a load from the stack.</li>
222<li>Each update of the variable becomes a store to the stack.</li>
223<li>Taking the address of a variable just uses the stack address directly.</li>
224</ol>
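<p>The four rules above can be sketched as a tiny emitter that prints textual
IR in the syntax used in this chapter. This is only an illustration under
stated assumptions: <tt>SlotEmitter</tt> and <tt>lowerIncrement</tt> are
hypothetical names, the emitter builds strings rather than using the LLVM API,
and it only handles <tt>i32</tt> variables.</p>

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper, not part of LLVM: collects textual IR lines.
struct SlotEmitter {
  std::vector<std::string> Lines;
  unsigned NextTmp;

  SlotEmitter() : NextTmp(0) {}

  // Rule 1: a mutable variable becomes a stack allocation.
  // Rule 4: the returned name *is* the address of the variable.
  std::string declare(const std::string &Name) {
    Lines.push_back("%" + Name + " = alloca i32");
    return "%" + Name;
  }

  // Rule 2: a read becomes a load from the stack slot.
  std::string read(const std::string &Slot) {
    std::string Tmp = "%tmp" + std::to_string(NextTmp++);
    Lines.push_back(Tmp + " = load i32* " + Slot);
    return Tmp;
  }

  // Rule 3: an update becomes a store to the stack slot.
  void write(const std::string &Slot, const std::string &Val) {
    Lines.push_back("store i32 " + Val + ", i32* " + Slot);
  }
};

// Lowers the C fragment "int X; X = X + 1;" using only the rules above.
std::vector<std::string> lowerIncrement() {
  SlotEmitter E;
  std::string X = E.declare("X");
  std::string Old = E.read(X);
  E.Lines.push_back("%inc = add i32 " + Old + ", 1");
  E.write(X, "%inc");
  return E.Lines;
}
```

<p>Note that no phi node appears anywhere in the emitter: control flow never
has to be consulted, because every access goes through the slot.</p>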
225
226<p>While this solution has solved our immediate problem, it introduced another
227one: we have now apparently introduced a lot of stack traffic for very simple
228and common operations, a major performance problem. Fortunately for us, the
229LLVM optimizer has a highly-tuned optimization pass named "mem2reg" that handles
230this case, promoting allocas like this into SSA registers, inserting Phi nodes
231as appropriate. If you run this example through the pass, for example, you'll
232get:</p>
233
234<div class="doc_code">
235<pre>
236$ <b>llvm-as &lt; example.ll | opt -mem2reg | llvm-dis</b>
237@G = weak global i32 0
238@H = weak global i32 0
239
240define i32 @test(i1 %Condition) {
241entry:
242 br i1 %Condition, label %cond_true, label %cond_false
243
244cond_true:
245 %X.0 = load i32* @G
246 br label %cond_next
247
248cond_false:
249 %X.1 = load i32* @H
250 br label %cond_next
251
252cond_next:
253 %X.01 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
254 ret i32 %X.01
255}
256</pre>
</div>

<p>The mem2reg pass implements the standard "iterated dominance frontier"
algorithm for constructing SSA form and has a number of optimizations that speed
up very common degenerate cases. mem2reg really is the answer for dealing with
mutable variables, and we highly recommend that you depend on it. Note that
mem2reg only works on variables in certain circumstances:</p>

<ol>
<li>mem2reg is alloca-driven: it looks for allocas and if it can handle them, it
promotes them. It does not apply to global variables or heap allocations.</li>

<li>mem2reg only looks for alloca instructions in the entry block of the
function. Being in the entry block guarantees that the alloca is only executed
once, which makes analysis simpler.</li>

<li>mem2reg only promotes allocas whose uses are direct loads and stores. If
the address of the stack object is passed to a function, or if any funny pointer
arithmetic is involved, the alloca will not be promoted.</li>

<li>mem2reg only works on allocas of <a
href="../LangRef.html#t_classifications">first class</a>
values (such as pointers, scalars and vectors), and only if the array size
of the allocation is 1 (or missing in the .ll file). mem2reg is not capable of
promoting structs or arrays to registers. Note that the "scalarrepl" pass is
more powerful and can promote structs, "unions", and arrays in many cases.</li>

</ol>
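<p>As a checklist, the conditions above can be restated as a single predicate.
This sketch is not how mem2reg is implemented: <tt>AllocaFacts</tt> and
<tt>looksPromotable</tt> are hypothetical names, and the fields are simply
assumed to have been computed by some earlier inspection of the IR.</p>

```cpp
#include <cassert>

// Hypothetical summary of one memory object. These fields are illustrative
// and do not correspond to LLVM data structures.
struct AllocaFacts {
  bool IsAlloca;                 // false for globals and heap allocations
  bool InEntryBlock;             // alloca sits in the function's entry block
  bool OnlyDirectLoadsAndStores; // address never escapes, no pointer math
  bool IsFirstClassType;         // pointer, scalar or vector; not struct/array
  unsigned ArraySize;            // 1 when omitted in the .ll file
};

// True only when every condition from the list above holds.
bool looksPromotable(const AllocaFacts &A) {
  return A.IsAlloca && A.InEntryBlock && A.OnlyDirectLoadsAndStores &&
         A.IsFirstClassType && A.ArraySize == 1;
}
```

<p>A front-end that creates one entry-block alloca per scalar variable and
only ever loads from and stores to it satisfies the predicate by
construction.</p>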
285
<p>
All of these properties are easy to satisfy for most imperative languages, and
we'll illustrate this below with Kaleidoscope. The final question you may be
asking is: should I bother with this nonsense for my front-end? Wouldn't it be
better if I just did SSA construction directly, avoiding use of the mem2reg
optimization pass? In short, we strongly recommend that you use this technique
for building SSA form, unless there is an extremely good reason not to. Using
this technique is:</p>
294
295<ul>
<li>Proven and well tested: llvm-gcc and clang both use this technique for local
mutable variables. As such, the most common clients of LLVM are using this to
handle the bulk of their variables. You can be sure that bugs are found fast and
fixed early.</li>
300
301<li>Extremely Fast: mem2reg has a number of special cases that make it fast in
302common cases as well as fully general. For example, it has fast-paths for
303variables that are only used in a single block, variables that only have one
304assignment point, good heuristics to avoid insertion of unneeded phi nodes, etc.
305</li>
306
307<li>Needed for debug info generation: <a href="../SourceLevelDebugging.html">
308Debug information in LLVM</a> relies on having the address of the variable
309exposed to attach debug info to it. This technique dovetails very naturally
310with this style of debug info.</li>
311</ul>
312
<p>If nothing else, this makes it much easier to get your front-end up and
running, and is very simple to implement. Let's extend Kaleidoscope with mutable
variables now!
</p>

</div>
319
<!-- *********************************************************************** -->
<div class="doc_section"><a name="kalvars">Mutable Variables in
Kaleidoscope</a></div>
323<!-- *********************************************************************** -->
324
325<div class="doc_text">
326
<p>Now that we know the sort of problem we want to tackle, let's see what this
328looks like in the context of our little Kaleidoscope language. We're going to
329add two features:</p>
330
331<ol>
332<li>The ability to mutate variables with the '=' operator.</li>
333<li>The ability to define new variables.</li>
334</ol>
335
336<p>While the first item is really what this is about, we only have variables
337for incoming arguments and for induction variables, and redefining them only
338goes so far :). Also, the ability to define new variables is a
339useful thing regardless of whether you will be mutating them. Here's a
340motivating example that shows how we could use these:</p>
341
342<div class="doc_code">
343<pre>
# Define ':' for sequencing: as a low-precedence operator that ignores operands
# and just returns the RHS.
def binary : 1 (x y) y;

# Recursive fib, we could do this before.
def fib(x)
  if (x &lt; 3) then
    1
  else
    fib(x-1)+fib(x-2);

# Iterative fib.
def fibi(x)
  <b>var a = 1, b = 1, c in</b>
  (for i = 3, i &lt; x in
     <b>c = a + b</b> :
     <b>a = b</b> :
     <b>b = c</b>) :
  b;

# Call it.
fibi(10);
366</pre>
367</div>
368
369<p>
370In order to mutate variables, we have to change our existing variables to use
371the "alloca trick". Once we have that, we'll add our new operator, then extend
372Kaleidoscope to support new variable definitions.
373</p>
374
375</div>
376
377<!-- *********************************************************************** -->
378<div class="doc_section"><a name="adjustments">Adjusting Existing Variables for
379Mutation</a></div>
380<!-- *********************************************************************** -->
381
382<div class="doc_text">
383
384<p>
385The symbol table in Kaleidoscope is managed at code generation time by the
386'<tt>NamedValues</tt>' map. This map currently keeps track of the LLVM "Value*"
387that holds the double value for the named variable. In order to support
mutation, we need to change this slightly, so that <tt>NamedValues</tt> holds
389the <em>memory location</em> of the variable in question. Note that this
390change is a refactoring: it changes the structure of the code, but does not
391(by itself) change the behavior of the compiler. All of these changes are
392isolated in the Kaleidoscope code generator.</p>
393
394<p>
395At this point in Kaleidoscope's development, it only supports variables for two
396things: incoming arguments to functions and the induction variable of 'for'
397loops. For consistency, we'll allow mutation of these variables in addition to
398other user-defined variables. This means that these will both need memory
399locations.
400</p>
401
402<p>To start our transformation of Kaleidoscope, we'll change the NamedValues
403map to map to AllocaInst* instead of Value*. Once we do this, the C++ compiler
will tell us what parts of the code we need to update:</p>
405
406<div class="doc_code">
407<pre>
408static std::map&lt;std::string, AllocaInst*&gt; NamedValues;
409</pre>
410</div>
411
412<p>Also, since we will need to create these alloca's, we'll use a helper
413function that ensures that the allocas are created in the entry block of the
414function:</p>
415
416<div class="doc_code">
417<pre>
418/// CreateEntryBlockAlloca - Create an alloca instruction in the entry block of
419/// the function. This is used for mutable variables etc.
420static AllocaInst *CreateEntryBlockAlloca(Function *TheFunction,
421 const std::string &amp;VarName) {
422 LLVMBuilder TmpB(&amp;TheFunction-&gt;getEntryBlock(),
423 TheFunction-&gt;getEntryBlock().begin());
424 return TmpB.CreateAlloca(Type::DoubleTy, 0, VarName.c_str());
425}
426</pre>
427</div>
428
429<p>This funny looking code creates an LLVMBuilder object that is pointing at
430the first instruction (.begin()) of the entry block. It then creates an alloca
431with the expected name and returns it. Because all values in Kaleidoscope are
432doubles, there is no need to pass in a type to use.</p>
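<p>The important property of the helper is that the alloca lands at the top of
the entry block no matter where code is currently being emitted. The following
toy model shows just that property. It deliberately avoids the LLVM API:
<tt>ToyFunction</tt> and <tt>createEntryBlockAlloca</tt> are hypothetical
stand-ins that represent instructions as plain strings.</p>

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for a function: a list of blocks, each a list of instruction
// strings. Blocks[0] plays the role of the entry block.
struct ToyFunction {
  std::vector<std::vector<std::string> > Blocks;
};

// Insert an alloca for VarName before the first instruction of the entry
// block and return the name of the new stack slot. All values are doubles,
// as in Kaleidoscope, so no type parameter is needed.
std::string createEntryBlockAlloca(ToyFunction &F, const std::string &VarName) {
  std::vector<std::string> &Entry = F.Blocks.front();
  Entry.insert(Entry.begin(), "%" + VarName + " = alloca double");
  return "%" + VarName;
}
```

<p>Calling the helper while "emitting" into a later block leaves that block
untouched and puts the alloca first in the entry block, which is where mem2reg
looks for it.</p>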
433
434<p>With this in place, the first functionality change we want to make is to
435variable references. In our new scheme, variables live on the stack, so code
436generating a reference to them actually needs to produce a load from the stack
437slot:</p>
438
439<div class="doc_code">
440<pre>
441Value *VariableExprAST::Codegen() {
442 // Look this variable up in the function.
443 Value *V = NamedValues[Name];
444 if (V == 0) return ErrorV("Unknown variable name");
445
446 // Load the value.
447 return Builder.CreateLoad(V, Name.c_str());
448}
449</pre>
450</div>
451
452<p>As you can see, this is pretty straight-forward. Next we need to update the
453things that define the variables to set up the alloca. We'll start with
454<tt>ForExprAST::Codegen</tt> (see the <a href="#code">full code listing</a> for
455the unabridged code):</p>
456
457<div class="doc_code">
458<pre>
  Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();
460
461 <b>// Create an alloca for the variable in the entry block.
462 AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);</b>
463
464 // Emit the start code first, without 'variable' in scope.
465 Value *StartVal = Start-&gt;Codegen();
466 if (StartVal == 0) return 0;
467
468 <b>// Store the value into the alloca.
469 Builder.CreateStore(StartVal, Alloca);</b>
470 ...
471
472 // Compute the end condition.
473 Value *EndCond = End-&gt;Codegen();
474 if (EndCond == 0) return EndCond;
475
476 <b>// Reload, increment, and restore the alloca. This handles the case where
477 // the body of the loop mutates the variable.
478 Value *CurVar = Builder.CreateLoad(Alloca);
479 Value *NextVar = Builder.CreateAdd(CurVar, StepVal, "nextvar");
480 Builder.CreateStore(NextVar, Alloca);</b>
481 ...
482</pre>
483</div>
484
485<p>This code is virtually identical to the code <a
486href="LangImpl5.html#forcodegen">before we allowed mutable variables</a>. The
487big difference is that we no longer have to construct a PHI node, and we use
488load/store to access the variable as needed.</p>
489
490<p>To support mutable argument variables, we need to also make allocas for them.
491The code for this is also pretty simple:</p>
492
493<div class="doc_code">
494<pre>
495/// CreateArgumentAllocas - Create an alloca for each argument and register the
496/// argument in the symbol table so that references to it will succeed.
497void PrototypeAST::CreateArgumentAllocas(Function *F) {
498 Function::arg_iterator AI = F-&gt;arg_begin();
499 for (unsigned Idx = 0, e = Args.size(); Idx != e; ++Idx, ++AI) {
500 // Create an alloca for this variable.
501 AllocaInst *Alloca = CreateEntryBlockAlloca(F, Args[Idx]);
502
503 // Store the initial value into the alloca.
504 Builder.CreateStore(AI, Alloca);
505
506 // Add arguments to variable symbol table.
507 NamedValues[Args[Idx]] = Alloca;
508 }
509}
510</pre>
511</div>
512
513<p>For each argument, we make an alloca, store the input value to the function
514into the alloca, and register the alloca as the memory location for the
515argument. This method gets invoked by <tt>FunctionAST::Codegen</tt> right after
516it sets up the entry block for the function.</p>
517
518<p>The final missing piece is adding the 'mem2reg' pass, which allows us to get
519good codegen once again:</p>
520
521<div class="doc_code">
522<pre>
523 // Set up the optimizer pipeline. Start with registering info about how the
524 // target lays out data structures.
525 OurFPM.add(new TargetData(*TheExecutionEngine-&gt;getTargetData()));
526 <b>// Promote allocas to registers.
527 OurFPM.add(createPromoteMemoryToRegisterPass());</b>
528 // Do simple "peephole" optimizations and bit-twiddling optzns.
529 OurFPM.add(createInstructionCombiningPass());
530 // Reassociate expressions.
531 OurFPM.add(createReassociatePass());
532</pre>
533</div>
534
535<p>It is interesting to see what the code looks like before and after the
536mem2reg optimization runs. For example, this is the before/after code for our
537recursive fib. Before the optimization:</p>
538
539<div class="doc_code">
540<pre>
541define double @fib(double %x) {
542entry:
543 <b>%x1 = alloca double
544 store double %x, double* %x1
545 %x2 = load double* %x1</b>
546 %multmp = fcmp ult double %x2, 3.000000e+00
547 %booltmp = uitofp i1 %multmp to double
548 %ifcond = fcmp one double %booltmp, 0.000000e+00
549 br i1 %ifcond, label %then, label %else
550
551then: ; preds = %entry
552 br label %ifcont
553
554else: ; preds = %entry
555 <b>%x3 = load double* %x1</b>
556 %subtmp = sub double %x3, 1.000000e+00
557 %calltmp = call double @fib( double %subtmp )
558 <b>%x4 = load double* %x1</b>
559 %subtmp5 = sub double %x4, 2.000000e+00
560 %calltmp6 = call double @fib( double %subtmp5 )
561 %addtmp = add double %calltmp, %calltmp6
562 br label %ifcont
563
564ifcont: ; preds = %else, %then
565 %iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ]
566 ret double %iftmp
567}
568</pre>
569</div>
570
571<p>Here there is only one variable (x, the input argument) but you can still
572see the extremely simple-minded code generation strategy we are using. In the
573entry block, an alloca is created, and the initial input value is stored into
574it. Each reference to the variable does a reload from the stack. Also, note
575that we didn't modify the if/then/else expression, so it still inserts a PHI
576node. While we could make an alloca for it, it is actually easier to create a
577PHI node for it, so we still just make the PHI.</p>
578
579<p>Here is the code after the mem2reg pass runs:</p>
580
581<div class="doc_code">
582<pre>
583define double @fib(double %x) {
584entry:
585 %multmp = fcmp ult double <b>%x</b>, 3.000000e+00
586 %booltmp = uitofp i1 %multmp to double
587 %ifcond = fcmp one double %booltmp, 0.000000e+00
588 br i1 %ifcond, label %then, label %else
589
590then:
591 br label %ifcont
592
593else:
594 %subtmp = sub double <b>%x</b>, 1.000000e+00
595 %calltmp = call double @fib( double %subtmp )
596 %subtmp5 = sub double <b>%x</b>, 2.000000e+00
597 %calltmp6 = call double @fib( double %subtmp5 )
598 %addtmp = add double %calltmp, %calltmp6
599 br label %ifcont
600
601ifcont: ; preds = %else, %then
602 %iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ]
603 ret double %iftmp
604}
605</pre>
606</div>
607
608<p>This is a trivial case for mem2reg, since there are no redefinitions of the
609variable. The point of showing this is to calm your tension about inserting
such blatant inefficiencies :).</p>
611
612<p>After the rest of the optimizers run, we get:</p>
613
614<div class="doc_code">
615<pre>
616define double @fib(double %x) {
617entry:
618 %multmp = fcmp ult double %x, 3.000000e+00
619 %booltmp = uitofp i1 %multmp to double
620 %ifcond = fcmp ueq double %booltmp, 0.000000e+00
621 br i1 %ifcond, label %else, label %ifcont
622
623else:
624 %subtmp = sub double %x, 1.000000e+00
625 %calltmp = call double @fib( double %subtmp )
626 %subtmp5 = sub double %x, 2.000000e+00
627 %calltmp6 = call double @fib( double %subtmp5 )
628 %addtmp = add double %calltmp, %calltmp6
629 ret double %addtmp
630
631ifcont:
632 ret double 1.000000e+00
633}
634</pre>
635</div>
636
637<p>Here we see that the simplifycfg pass decided to clone the return instruction
638into the end of the 'else' block. This allowed it to eliminate some branches
639and the PHI node.</p>
640
641<p>Now that all symbol table references are updated to use stack variables,
642we'll add the assignment operator.</p>
643
644</div>
645
646<!-- *********************************************************************** -->
647<div class="doc_section"><a name="assignment">New Assignment Operator</a></div>
648<!-- *********************************************************************** -->
649
650<div class="doc_text">
651
652<p>With our current framework, adding a new assignment operator is really
653simple. We will parse it just like any other binary operator, but handle it
654internally (instead of allowing the user to define it). The first step is to
655set a precedence:</p>
656
657<div class="doc_code">
658<pre>
659 int main() {
660 // Install standard binary operators.
661 // 1 is lowest precedence.
662 <b>BinopPrecedence['='] = 2;</b>
663 BinopPrecedence['&lt;'] = 10;
664 BinopPrecedence['+'] = 20;
665 BinopPrecedence['-'] = 20;
666</pre>
667</div>
668
669<p>Now that the parser knows the precedence of the binary operator, it takes
670care of all the parsing and AST generation. We just need to implement codegen
671for the assignment operator. This looks like:</p>
672
673<div class="doc_code">
674<pre>
675Value *BinaryExprAST::Codegen() {
676 // Special case '=' because we don't want to emit the LHS as an expression.
677 if (Op == '=') {
678 // Assignment requires the LHS to be an identifier.
679 VariableExprAST *LHSE = dynamic_cast&lt;VariableExprAST*&gt;(LHS);
680 if (!LHSE)
681 return ErrorV("destination of '=' must be a variable");
682</pre>
683</div>
684
685<p>Unlike the rest of the binary operators, our assignment operator doesn't
686follow the "emit LHS, emit RHS, do computation" model. As such, it is handled
687as a special case before the other binary operators are handled. The other
688strange thing about it is that it requires the LHS to be a variable directly.
689</p>
690
691<div class="doc_code">
692<pre>
693 // Codegen the RHS.
694 Value *Val = RHS-&gt;Codegen();
695 if (Val == 0) return 0;
696
697 // Look up the name.
698 Value *Variable = NamedValues[LHSE-&gt;getName()];
699 if (Variable == 0) return ErrorV("Unknown variable name");
700
701 Builder.CreateStore(Val, Variable);
702 return Val;
703 }
704 ...
705</pre>
706</div>
707
708<p>Once it has the variable, codegen'ing the assignment is straight-forward:
709we emit the RHS of the assignment, create a store, and return the computed
710value. Returning a value allows for chained assignments like "X = (Y = Z)".</p>
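<p>The "store, then return the stored value" behaviour is what makes chaining
work. Here is a small sketch of that behaviour outside of LLVM: the
<tt>Slots</tt> map and the <tt>assign</tt> function are hypothetical, and plain
doubles stand in for the stack slots and IR values.</p>

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy symbol table: variable name -> current contents of its "stack slot".
static std::map<std::string, double> Slots;

// Mirrors the shape of the '=' special case: the destination must be a known
// variable, the value is stored into its slot, and the same value is handed
// back so that an enclosing expression can use it.
bool assign(const std::string &Name, double Val, double &Result) {
  std::map<std::string, double>::iterator It = Slots.find(Name);
  if (It == Slots.end())
    return false; // corresponds to "Unknown variable name"
  It->second = Val;
  Result = Val;
  return true;
}
```

<p>Evaluating "X = (Y = 7)" in this model means calling <tt>assign</tt> for Y
first and feeding its result into the call for X, after which both slots hold
7.</p>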
711
712<p>Now that we have an assignment operator, we can mutate loop variables and
713arguments. For example, we can now run code like this:</p>
714
715<div class="doc_code">
716<pre>
717# Function to print a double.
718extern printd(x);
719
720# Define ':' for sequencing: as a low-precedence operator that ignores operands
721# and just returns the RHS.
722def binary : 1 (x y) y;
723
724def test(x)
725 printd(x) :
726 x = 4 :
727 printd(x);
728
729test(123);
730</pre>
731</div>
732
733<p>When run, this example prints "123" and then "4", showing that we did
734actually mutate the value! Okay, we have now officially implemented our goal:
735getting this to work requires SSA construction in the general case. However,
to be really useful, we want the ability to define our own local variables, so
let's add this next!
738</p>
739
740</div>
741
742<!-- *********************************************************************** -->
743<div class="doc_section"><a name="localvars">User-defined Local
744Variables</a></div>
745<!-- *********************************************************************** -->
746
747<div class="doc_text">
748
<p>Adding var/in is just like the other extensions we made to
750Kaleidoscope: we extend the lexer, the parser, the AST and the code generator.
751The first step for adding our new 'var/in' construct is to extend the lexer.
As before, this is pretty trivial; the code looks like this:</p>
753
754<div class="doc_code">
755<pre>
756enum Token {
757 ...
758 <b>// var definition
759 tok_var = -13</b>
760...
761}
762...
763static int gettok() {
764...
765 if (IdentifierStr == "in") return tok_in;
766 if (IdentifierStr == "binary") return tok_binary;
767 if (IdentifierStr == "unary") return tok_unary;
768 <b>if (IdentifierStr == "var") return tok_var;</b>
769 return tok_identifier;
770...
771</pre>
772</div>
773
774<p>The next step is to define the AST node that we will construct. For var/in,
775it will look like this:</p>
776
777<div class="doc_code">
778<pre>
779/// VarExprAST - Expression class for var/in
780class VarExprAST : public ExprAST {
781 std::vector&lt;std::pair&lt;std::string, ExprAST*&gt; &gt; VarNames;
782 ExprAST *Body;
783public:
784 VarExprAST(const std::vector&lt;std::pair&lt;std::string, ExprAST*&gt; &gt; &amp;varnames,
785 ExprAST *body)
786 : VarNames(varnames), Body(body) {}
787
788 virtual Value *Codegen();
789};
790</pre>
791</div>
792
793<p>var/in allows a list of names to be defined all at once, and each name can
794optionally have an initializer value. As such, we capture this information in
the VarNames vector. Also, var/in has a body, and this body is allowed to access
the variables defined by the var/in.</p>
797
798<p>With this ready, we can define the parser pieces. First thing we do is add
799it as a primary expression:</p>
800
801<div class="doc_code">
802<pre>
803/// primary
804/// ::= identifierexpr
805/// ::= numberexpr
806/// ::= parenexpr
807/// ::= ifexpr
808/// ::= forexpr
809<b>/// ::= varexpr</b>
810static ExprAST *ParsePrimary() {
811 switch (CurTok) {
812 default: return Error("unknown token when expecting an expression");
813 case tok_identifier: return ParseIdentifierExpr();
814 case tok_number: return ParseNumberExpr();
815 case '(': return ParseParenExpr();
816 case tok_if: return ParseIfExpr();
817 case tok_for: return ParseForExpr();
818 <b>case tok_var: return ParseVarExpr();</b>
819 }
820}
821</pre>
822</div>
823
824<p>Next we define ParseVarExpr:</p>
825
826<div class="doc_code">
827<pre>
/// varexpr ::= 'var' identifier ('=' expression)?
///                   (',' identifier ('=' expression)?)* 'in' expression
static ExprAST *ParseVarExpr() {
831 getNextToken(); // eat the var.
832
833 std::vector&lt;std::pair&lt;std::string, ExprAST*&gt; &gt; VarNames;
834
835 // At least one variable name is required.
836 if (CurTok != tok_identifier)
837 return Error("expected identifier after var");
838</pre>
839</div>
840
<p>The first part of this code parses the list of identifier/expr pairs into the
local <tt>VarNames</tt> vector.</p>

<div class="doc_code">
<pre>
  while (1) {
    std::string Name = IdentifierStr;
    getNextToken();  // eat identifier.

    // Read the optional initializer.
    ExprAST *Init = 0;
    if (CurTok == '=') {
      getNextToken(); // eat the '='.

      Init = ParseExpression();
      if (Init == 0) return 0;
    }

    VarNames.push_back(std::make_pair(Name, Init));

    // End of var list, exit loop.
    if (CurTok != ',') break;
    getNextToken(); // eat the ','.

    if (CurTok != tok_identifier)
      return Error("expected identifier list after var");
  }
</pre>
</div>

<p>Once all the variables are parsed, we then parse the body and create the
AST node:</p>

<div class="doc_code">
<pre>
  // At this point, we have to have 'in'.
  if (CurTok != tok_in)
    return Error("expected 'in' keyword after 'var'");
  getNextToken();  // eat 'in'.

  ExprAST *Body = ParseExpression();
  if (Body == 0) return 0;

  return new VarExprAST(VarNames, Body);
}
</pre>
</div>
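<p>To see the control flow of this parser in isolation, here is a hedged,
LLVM-free sketch that follows the same steps over a pre-split token vector. It
is a simplification under stated assumptions: tokens are plain strings,
initializers are restricted to a single numeric literal instead of a full
expression, and <tt>Tokens</tt>, <tt>VarDecl</tt>, <tt>IsIdentifier</tt> and
<tt>ParseVarList</tt> are hypothetical names that do not appear in toy.cpp.</p>

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical token stream: each token is already split out as a string.
typedef std::vector<std::string> Tokens;

// One declared variable; HasInit is false when "= value" was omitted.
struct VarDecl {
  std::string Name;
  bool HasInit;
  double Init;
};

// Treats any word starting with a letter as an identifier, except keywords.
static bool IsIdentifier(const std::string &Tok) {
  return !Tok.empty() && std::isalpha(static_cast<unsigned char>(Tok[0])) &&
         Tok != "var" && Tok != "in";
}

// Parses: 'var' id ('=' number)? (',' id ('=' number)?)* 'in'
// On success returns true, appends to Out, and leaves Pos just past 'in'.
static bool ParseVarList(const Tokens &T, std::size_t &Pos,
                         std::vector<VarDecl> &Out) {
  if (Pos >= T.size() || T[Pos] != "var") return false;
  ++Pos; // eat 'var'

  // At least one variable name is required.
  if (Pos >= T.size() || !IsIdentifier(T[Pos])) return false;

  while (true) {
    VarDecl D;
    D.Name = T[Pos++]; // eat identifier
    D.HasInit = false;
    D.Init = 0.0;

    // Read the optional initializer (a numeric literal in this sketch).
    if (Pos < T.size() && T[Pos] == "=") {
      ++Pos; // eat '='
      if (Pos >= T.size()) return false;
      char *End = 0;
      D.Init = std::strtod(T[Pos].c_str(), &End);
      if (End == T[Pos].c_str()) return false; // not a number
      D.HasInit = true;
      ++Pos; // eat the literal
    }
    Out.push_back(D);

    // End of var list, exit loop.
    if (Pos >= T.size() || T[Pos] != ",") break;
    ++Pos; // eat ','
    if (Pos >= T.size() || !IsIdentifier(T[Pos])) return false;
  }

  // At this point, we have to have 'in'.
  if (Pos >= T.size() || T[Pos] != "in") return false;
  ++Pos; // eat 'in'
  return true;
}
```

<p>For the token sequence <tt>var a = 1 , b in a</tt>, the sketch records two
declarations (<tt>a</tt> with an initializer, <tt>b</tt> without) and stops on
the first token of the body.</p>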

<p>Now that we can parse and represent the code, we need to support emission of
LLVM IR for it. This code starts out with:</p>

<div class="doc_code">
<pre>
Value *VarExprAST::Codegen() {
  std::vector&lt;AllocaInst *&gt; OldBindings;

  Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();

  // Register all variables and emit their initializer.
  for (unsigned i = 0, e = VarNames.size(); i != e; ++i) {
    const std::string &amp;VarName = VarNames[i].first;
    ExprAST *Init = VarNames[i].second;
</pre>
</div>

<p>Basically, it loops over all the variables, installing them one at a time.
For each variable we put into the symbol table, we remember the previous binding
that we replace in OldBindings.</p>

<div class="doc_code">
<pre>
    // Emit the initializer before adding the variable to scope, this prevents
    // the initializer from referencing the variable itself, and permits stuff
    // like this:
    //  var a = 1 in
    //    var a = a in ...   # refers to outer 'a'.
    Value *InitVal;
    if (Init) {
      InitVal = Init-&gt;Codegen();
      if (InitVal == 0) return 0;
    } else { // If not specified, use 0.0.
      InitVal = ConstantFP::get(Type::DoubleTy, APFloat(0.0));
    }

    AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);
    Builder.CreateStore(InitVal, Alloca);

    // Remember the old variable binding so that we can restore the binding when
    // we unrecurse.
    OldBindings.push_back(NamedValues[VarName]);

    // Remember this binding.
    NamedValues[VarName] = Alloca;
  }
</pre>
</div>

<p>There are more comments here than code. The basic idea is that we emit the
initializer, create the alloca, then update the symbol table to point to it.
Once all the variables are installed in the symbol table, we evaluate the body
of the var/in expression:</p>

<div class="doc_code">
<pre>
  // Codegen the body, now that all vars are in scope.
  Value *BodyVal = Body-&gt;Codegen();
  if (BodyVal == 0) return 0;
</pre>
</div>
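<p>The order of operations so far (evaluate the initializer, bind the name,
then evaluate the body) is what makes <tt>var a = 1 in var a = a in ...</tt>
refer to the outer <tt>a</tt>. The sketch below models that ordering without
LLVM, under the assumption that a plain <tt>double*</tt> can stand in for an
<tt>AllocaInst*</tt>. <tt>SymbolTable</tt>, <tt>Bind</tt>,
<tt>ShadowResult</tt> and <tt>RunShadowingExample</tt> are hypothetical names
used only for this illustration.</p>

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for NamedValues: a name maps to a storage slot.
typedef std::map<std::string, double*> SymbolTable;

// Installs Name -> Slot and returns the binding it replaced (null if none).
static double *Bind(SymbolTable &Syms, const std::string &Name, double *Slot) {
  double *Old = Syms[Name]; // operator[] default-inserts a null pointer
  Syms[Name] = Slot;
  return Old;
}

struct ShadowResult {
  double InnerValue; // value of the inner 'a' after the body ran
  double OuterValue; // value of the outer 'a' once the inner scope is popped
};

// Models:  var a = 1 in (var a = a in a = a + 10)
static ShadowResult RunShadowingExample() {
  SymbolTable Syms;

  // Outer "var a = 1": evaluate the initializer, then bind.
  double OuterSlot = 1.0;
  double *SavedOuter = Bind(Syms, "a", &OuterSlot);

  // Inner "var a = a": the initializer is evaluated first, while "a" still
  // names the outer slot, and only then is the new slot bound.
  double InnerSlot = *Syms["a"];
  double *SavedInner = Bind(Syms, "a", &InnerSlot);

  // Body of the inner var/in: the assignment touches only the inner slot.
  *Syms["a"] = *Syms["a"] + 10.0;

  ShadowResult R;
  R.InnerValue = *Syms["a"];
  Syms["a"] = SavedInner;    // pop the inner binding
  R.OuterValue = *Syms["a"]; // the outer 'a' is visible again and unchanged
  Syms["a"] = SavedOuter;    // pop the outer binding (null: nothing was bound)
  return R;
}
```

<p>In this model the inner variable ends up holding 11 while the outer one
still holds 1, which is the scoping behavior the comments in the code above
describe.</p>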

<p>Finally, before returning, we restore the previous variable bindings:</p>

<div class="doc_code">
<pre>
  // Pop all our variables from scope.
  for (unsigned i = 0, e = VarNames.size(); i != e; ++i)
    NamedValues[VarNames[i].first] = OldBindings[i];

  // Return the body computation.
  return BodyVal;
}
</pre>
</div>
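<p>Two details of this save/restore scheme are worth noting. The early
<tt>return 0</tt> on a failed body skips the restore loop, and if the same name
appears twice in one var list, restoring in forward order appears to leave the
name bound to the first of the two allocas. One possible alternative, which is
not what the tutorial code does, is an RAII scope guard that undoes the
bindings in reverse order from its destructor. The sketch below illustrates the
idea without LLVM; <tt>ScopedBindings</tt> and <tt>SymbolTable</tt> are
hypothetical names, and <tt>double*</tt> again stands in for
<tt>AllocaInst*</tt>.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for NamedValues: a name maps to a storage slot.
typedef std::map<std::string, double*> SymbolTable;

// Records every binding it installs and undoes them when it goes out of scope.
class ScopedBindings {
  SymbolTable &Syms;
  std::vector<std::pair<std::string, double*> > Saved;

  // Non-copyable: a copy would restore the same bindings twice.
  ScopedBindings(const ScopedBindings &);
  ScopedBindings &operator=(const ScopedBindings &);

public:
  explicit ScopedBindings(SymbolTable &S) : Syms(S) {}

  // Binds Name to Slot, remembering whatever it shadowed (null if nothing).
  void Bind(const std::string &Name, double *Slot) {
    SymbolTable::iterator It = Syms.find(Name);
    double *Old = (It == Syms.end()) ? static_cast<double*>(0) : It->second;
    Saved.push_back(std::make_pair(Name, Old));
    Syms[Name] = Slot;
  }

  // Restores in reverse order so repeated names unwind to the outer binding;
  // names that were unbound before are erased instead of being left as null.
  ~ScopedBindings() {
    for (std::size_t i = Saved.size(); i != 0; --i) {
      const std::pair<std::string, double*> &P = Saved[i - 1];
      if (P.second)
        Syms[P.first] = P.second;
      else
        Syms.erase(P.first);
    }
  }
};
```

<p>Because the destructor runs on every exit path, a guard like this would
restore the bindings on early error returns as well as on success.</p>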

<p>The end result of all of this is that we get properly scoped variable
definitions, and we even (trivially) allow mutation of them :).</p>

<p>With this, we have completed what we set out to do. Our nice iterative fib
example from the intro compiles and runs just fine. The mem2reg pass optimizes
all of our stack variables into SSA registers, inserting PHI nodes where needed,
and our front-end remains simple: no iterated dominator frontier computation
anywhere in sight.</p>

</div>

<!-- *********************************************************************** -->
<div class="doc_section"><a name="code">Full Code Listing</a></div>
<!-- *********************************************************************** -->

<div class="doc_text">

<p>
Here is the complete code listing for our running example, enhanced with mutable
variables and var/in support. To build this example, use:
</p>

<div class="doc_code">
<pre>
   # Compile
   g++ -g toy.cpp `llvm-config --cppflags --ldflags --libs core jit native` -O3 -o toy
   # Run
   ./toy
</pre>
</div>

<p>Here is the code:</p>

<div class="doc_code">
<pre>
#include "llvm/DerivedTypes.h"
#include "llvm/ExecutionEngine/ExecutionEngine.h"
#include "llvm/Module.h"
#include "llvm/ModuleProvider.h"
#include "llvm/PassManager.h"
#include "llvm/Analysis/Verifier.h"
#include "llvm/Target/TargetData.h"
#include "llvm/Transforms/Scalar.h"
#include "llvm/Support/LLVMBuilder.h"
#include &lt;cassert&gt;
#include &lt;cctype&gt;
#include &lt;cstdio&gt;
#include &lt;cstdlib&gt;
#include &lt;string&gt;
#include &lt;map&gt;
#include &lt;vector&gt;
using namespace llvm;

//===----------------------------------------------------------------------===//
// Lexer
//===----------------------------------------------------------------------===//

// The lexer returns tokens [0-255] if it is an unknown character, otherwise one
// of these for known things.
enum Token {
  tok_eof = -1,

  // commands
  tok_def = -2, tok_extern = -3,

  // primary
  tok_identifier = -4, tok_number = -5,

  // control
  tok_if = -6, tok_then = -7, tok_else = -8,
  tok_for = -9, tok_in = -10,

  // operators
  tok_binary = -11, tok_unary = -12,

  // var definition
  tok_var = -13
};

static std::string IdentifierStr;  // Filled in if tok_identifier
static double NumVal;              // Filled in if tok_number

/// gettok - Return the next token from standard input.
static int gettok() {
  static int LastChar = ' ';

  // Skip any whitespace.
  while (isspace(LastChar))
    LastChar = getchar();

  if (isalpha(LastChar)) { // identifier: [a-zA-Z][a-zA-Z0-9]*
    IdentifierStr = LastChar;
    while (isalnum((LastChar = getchar())))
      IdentifierStr += LastChar;

    if (IdentifierStr == "def") return tok_def;
    if (IdentifierStr == "extern") return tok_extern;
    if (IdentifierStr == "if") return tok_if;
    if (IdentifierStr == "then") return tok_then;
    if (IdentifierStr == "else") return tok_else;
    if (IdentifierStr == "for") return tok_for;
    if (IdentifierStr == "in") return tok_in;
    if (IdentifierStr == "binary") return tok_binary;
    if (IdentifierStr == "unary") return tok_unary;
    if (IdentifierStr == "var") return tok_var;
    return tok_identifier;
  }

  if (isdigit(LastChar) || LastChar == '.') {   // Number: [0-9.]+
    std::string NumStr;
    do {
      NumStr += LastChar;
      LastChar = getchar();
    } while (isdigit(LastChar) || LastChar == '.');

    NumVal = strtod(NumStr.c_str(), 0);
    return tok_number;
  }

  if (LastChar == '#') {
    // Comment until end of line.
    do LastChar = getchar();
    while (LastChar != EOF &amp;&amp; LastChar != '\n' &amp;&amp; LastChar != '\r');

    if (LastChar != EOF)
      return gettok();
  }

  // Check for end of file.  Don't eat the EOF.
  if (LastChar == EOF)
    return tok_eof;

  // Otherwise, just return the character as its ascii value.
  int ThisChar = LastChar;
  LastChar = getchar();
  return ThisChar;
}

//===----------------------------------------------------------------------===//
// Abstract Syntax Tree (aka Parse Tree)
//===----------------------------------------------------------------------===//

/// ExprAST - Base class for all expression nodes.
class ExprAST {
public:
  virtual ~ExprAST() {}
  virtual Value *Codegen() = 0;
};

/// NumberExprAST - Expression class for numeric literals like "1.0".
class NumberExprAST : public ExprAST {
  double Val;
public:
  NumberExprAST(double val) : Val(val) {}
  virtual Value *Codegen();
};

/// VariableExprAST - Expression class for referencing a variable, like "a".
class VariableExprAST : public ExprAST {
  std::string Name;
public:
  VariableExprAST(const std::string &amp;name) : Name(name) {}
  const std::string &amp;getName() const { return Name; }
  virtual Value *Codegen();
};

/// UnaryExprAST - Expression class for a unary operator.
class UnaryExprAST : public ExprAST {
  char Opcode;
  ExprAST *Operand;
public:
  UnaryExprAST(char opcode, ExprAST *operand)
    : Opcode(opcode), Operand(operand) {}
  virtual Value *Codegen();
};

/// BinaryExprAST - Expression class for a binary operator.
class BinaryExprAST : public ExprAST {
  char Op;
  ExprAST *LHS, *RHS;
public:
  BinaryExprAST(char op, ExprAST *lhs, ExprAST *rhs)
    : Op(op), LHS(lhs), RHS(rhs) {}
  virtual Value *Codegen();
};

/// CallExprAST - Expression class for function calls.
class CallExprAST : public ExprAST {
  std::string Callee;
  std::vector&lt;ExprAST*&gt; Args;
public:
  CallExprAST(const std::string &amp;callee, std::vector&lt;ExprAST*&gt; &amp;args)
    : Callee(callee), Args(args) {}
  virtual Value *Codegen();
};

/// IfExprAST - Expression class for if/then/else.
class IfExprAST : public ExprAST {
  ExprAST *Cond, *Then, *Else;
public:
  IfExprAST(ExprAST *cond, ExprAST *then, ExprAST *_else)
  : Cond(cond), Then(then), Else(_else) {}
  virtual Value *Codegen();
};

/// ForExprAST - Expression class for for/in.
class ForExprAST : public ExprAST {
  std::string VarName;
  ExprAST *Start, *End, *Step, *Body;
public:
  ForExprAST(const std::string &amp;varname, ExprAST *start, ExprAST *end,
             ExprAST *step, ExprAST *body)
    : VarName(varname), Start(start), End(end), Step(step), Body(body) {}
  virtual Value *Codegen();
};

/// VarExprAST - Expression class for var/in
class VarExprAST : public ExprAST {
  std::vector&lt;std::pair&lt;std::string, ExprAST*&gt; &gt; VarNames;
  ExprAST *Body;
public:
  VarExprAST(const std::vector&lt;std::pair&lt;std::string, ExprAST*&gt; &gt; &amp;varnames,
             ExprAST *body)
  : VarNames(varnames), Body(body) {}

  virtual Value *Codegen();
};

/// PrototypeAST - This class represents the "prototype" for a function,
/// which captures its argument names as well as if it is an operator.
class PrototypeAST {
  std::string Name;
  std::vector&lt;std::string&gt; Args;
  bool isOperator;
  unsigned Precedence;  // Precedence if a binary op.
public:
  PrototypeAST(const std::string &amp;name, const std::vector&lt;std::string&gt; &amp;args,
               bool isoperator = false, unsigned prec = 0)
  : Name(name), Args(args), isOperator(isoperator), Precedence(prec) {}

  bool isUnaryOp() const { return isOperator &amp;&amp; Args.size() == 1; }
  bool isBinaryOp() const { return isOperator &amp;&amp; Args.size() == 2; }

  char getOperatorName() const {
    assert(isUnaryOp() || isBinaryOp());
    return Name[Name.size()-1];
  }

  unsigned getBinaryPrecedence() const { return Precedence; }

  Function *Codegen();

  void CreateArgumentAllocas(Function *F);
};

/// FunctionAST - This class represents a function definition itself.
class FunctionAST {
  PrototypeAST *Proto;
  ExprAST *Body;
public:
  FunctionAST(PrototypeAST *proto, ExprAST *body)
    : Proto(proto), Body(body) {}

  Function *Codegen();
};

//===----------------------------------------------------------------------===//
// Parser
//===----------------------------------------------------------------------===//

/// CurTok/getNextToken - Provide a simple token buffer.  CurTok is the current
/// token the parser is looking at.  getNextToken reads another token from the
/// lexer and updates CurTok with its results.
static int CurTok;
static int getNextToken() {
  return CurTok = gettok();
}

/// BinopPrecedence - This holds the precedence for each binary operator that is
/// defined.
static std::map&lt;char, int&gt; BinopPrecedence;

/// GetTokPrecedence - Get the precedence of the pending binary operator token.
static int GetTokPrecedence() {
  if (!isascii(CurTok))
    return -1;

  // Make sure it's a declared binop.
  int TokPrec = BinopPrecedence[CurTok];
  if (TokPrec &lt;= 0) return -1;
  return TokPrec;
}

/// Error* - These are little helper functions for error handling.
ExprAST *Error(const char *Str) { fprintf(stderr, "Error: %s\n", Str);return 0;}
PrototypeAST *ErrorP(const char *Str) { Error(Str); return 0; }
FunctionAST *ErrorF(const char *Str) { Error(Str); return 0; }

static ExprAST *ParseExpression();

/// identifierexpr
///   ::= identifier
///   ::= identifier '(' expression* ')'
static ExprAST *ParseIdentifierExpr() {
  std::string IdName = IdentifierStr;

  getNextToken();  // eat identifier.

  if (CurTok != '(') // Simple variable ref.
    return new VariableExprAST(IdName);

  // Call.
  getNextToken();  // eat (
  std::vector&lt;ExprAST*&gt; Args;
  if (CurTok != ')') {
    while (1) {
      ExprAST *Arg = ParseExpression();
      if (!Arg) return 0;
      Args.push_back(Arg);

      if (CurTok == ')') break;

      if (CurTok != ',')
        return Error("Expected ')'");
      getNextToken();
    }
  }

  // Eat the ')'.
  getNextToken();

  return new CallExprAST(IdName, Args);
}

/// numberexpr ::= number
static ExprAST *ParseNumberExpr() {
  ExprAST *Result = new NumberExprAST(NumVal);
  getNextToken(); // consume the number
  return Result;
}

/// parenexpr ::= '(' expression ')'
static ExprAST *ParseParenExpr() {
  getNextToken();  // eat (.
  ExprAST *V = ParseExpression();
  if (!V) return 0;

  if (CurTok != ')')
    return Error("expected ')'");
  getNextToken();  // eat ).
  return V;
}

/// ifexpr ::= 'if' expression 'then' expression 'else' expression
static ExprAST *ParseIfExpr() {
  getNextToken();  // eat the if.

  // condition.
  ExprAST *Cond = ParseExpression();
  if (!Cond) return 0;

  if (CurTok != tok_then)
    return Error("expected then");
  getNextToken();  // eat the then

  ExprAST *Then = ParseExpression();
  if (Then == 0) return 0;

  if (CurTok != tok_else)
    return Error("expected else");

  getNextToken();

  ExprAST *Else = ParseExpression();
  if (!Else) return 0;

  return new IfExprAST(Cond, Then, Else);
}

/// forexpr ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression
static ExprAST *ParseForExpr() {
  getNextToken();  // eat the for.

  if (CurTok != tok_identifier)
    return Error("expected identifier after for");

  std::string IdName = IdentifierStr;
  getNextToken();  // eat identifier.

  if (CurTok != '=')
    return Error("expected '=' after for");
  getNextToken();  // eat '='.


  ExprAST *Start = ParseExpression();
  if (Start == 0) return 0;
  if (CurTok != ',')
    return Error("expected ',' after for start value");
  getNextToken();

  ExprAST *End = ParseExpression();
  if (End == 0) return 0;

  // The step value is optional.
  ExprAST *Step = 0;
  if (CurTok == ',') {
    getNextToken();
    Step = ParseExpression();
    if (Step == 0) return 0;
  }

  if (CurTok != tok_in)
    return Error("expected 'in' after for");
  getNextToken();  // eat 'in'.

  ExprAST *Body = ParseExpression();
  if (Body == 0) return 0;

  return new ForExprAST(IdName, Start, End, Step, Body);
}

/// varexpr ::= 'var' identifier ('=' expression)?
///                   (',' identifier ('=' expression)?)* 'in' expression
static ExprAST *ParseVarExpr() {
  getNextToken();  // eat the var.

  std::vector&lt;std::pair&lt;std::string, ExprAST*&gt; &gt; VarNames;

  // At least one variable name is required.
  if (CurTok != tok_identifier)
    return Error("expected identifier after var");

  while (1) {
    std::string Name = IdentifierStr;
    getNextToken();  // eat identifier.

    // Read the optional initializer.
    ExprAST *Init = 0;
    if (CurTok == '=') {
      getNextToken(); // eat the '='.

      Init = ParseExpression();
      if (Init == 0) return 0;
    }

    VarNames.push_back(std::make_pair(Name, Init));

    // End of var list, exit loop.
    if (CurTok != ',') break;
    getNextToken(); // eat the ','.

    if (CurTok != tok_identifier)
      return Error("expected identifier list after var");
  }

  // At this point, we have to have 'in'.
  if (CurTok != tok_in)
    return Error("expected 'in' keyword after 'var'");
  getNextToken();  // eat 'in'.

  ExprAST *Body = ParseExpression();
  if (Body == 0) return 0;

  return new VarExprAST(VarNames, Body);
}


/// primary
///   ::= identifierexpr
///   ::= numberexpr
///   ::= parenexpr
///   ::= ifexpr
///   ::= forexpr
///   ::= varexpr
static ExprAST *ParsePrimary() {
  switch (CurTok) {
  default: return Error("unknown token when expecting an expression");
  case tok_identifier: return ParseIdentifierExpr();
  case tok_number:     return ParseNumberExpr();
  case '(':            return ParseParenExpr();
  case tok_if:         return ParseIfExpr();
  case tok_for:        return ParseForExpr();
  case tok_var:        return ParseVarExpr();
  }
}

/// unary
///   ::= primary
///   ::= '!' unary
static ExprAST *ParseUnary() {
  // If the current token is not an operator, it must be a primary expr.
  if (!isascii(CurTok) || CurTok == '(' || CurTok == ',')
    return ParsePrimary();

  // If this is a unary operator, read it.
  int Opc = CurTok;
  getNextToken();
  if (ExprAST *Operand = ParseUnary())
    return new UnaryExprAST(Opc, Operand);
  return 0;
}

/// binoprhs
///   ::= ('+' unary)*
static ExprAST *ParseBinOpRHS(int ExprPrec, ExprAST *LHS) {
  // If this is a binop, find its precedence.
  while (1) {
    int TokPrec = GetTokPrecedence();

    // If this is a binop that binds at least as tightly as the current binop,
    // consume it, otherwise we are done.
    if (TokPrec &lt; ExprPrec)
      return LHS;

    // Okay, we know this is a binop.
    int BinOp = CurTok;
    getNextToken();  // eat binop

    // Parse the unary expression after the binary operator.
    ExprAST *RHS = ParseUnary();
    if (!RHS) return 0;

    // If BinOp binds less tightly with RHS than the operator after RHS, let
    // the pending operator take RHS as its LHS.
    int NextPrec = GetTokPrecedence();
    if (TokPrec &lt; NextPrec) {
      RHS = ParseBinOpRHS(TokPrec+1, RHS);
      if (RHS == 0) return 0;
    }

    // Merge LHS/RHS.
    LHS = new BinaryExprAST(BinOp, LHS, RHS);
  }
}

/// expression
///   ::= unary binoprhs
///
static ExprAST *ParseExpression() {
  ExprAST *LHS = ParseUnary();
  if (!LHS) return 0;

  return ParseBinOpRHS(0, LHS);
}

/// prototype
///   ::= id '(' id* ')'
///   ::= binary LETTER number? (id, id)
///   ::= unary LETTER (id)
static PrototypeAST *ParsePrototype() {
  std::string FnName;

  int Kind = 0;  // 0 = identifier, 1 = unary, 2 = binary.
  unsigned BinaryPrecedence = 30;

  switch (CurTok) {
  default:
    return ErrorP("Expected function name in prototype");
  case tok_identifier:
    FnName = IdentifierStr;
    Kind = 0;
    getNextToken();
    break;
  case tok_unary:
    getNextToken();
    if (!isascii(CurTok))
      return ErrorP("Expected unary operator");
    FnName = "unary";
    FnName += (char)CurTok;
    Kind = 1;
    getNextToken();
    break;
  case tok_binary:
    getNextToken();
    if (!isascii(CurTok))
      return ErrorP("Expected binary operator");
    FnName = "binary";
    FnName += (char)CurTok;
    Kind = 2;
    getNextToken();

    // Read the precedence if present.
    if (CurTok == tok_number) {
      if (NumVal &lt; 1 || NumVal &gt; 100)
        return ErrorP("Invalid precedence: must be 1..100");
      BinaryPrecedence = (unsigned)NumVal;
      getNextToken();
    }
    break;
  }

  if (CurTok != '(')
    return ErrorP("Expected '(' in prototype");

  std::vector&lt;std::string&gt; ArgNames;
  while (getNextToken() == tok_identifier)
    ArgNames.push_back(IdentifierStr);
  if (CurTok != ')')
    return ErrorP("Expected ')' in prototype");

  // success.
  getNextToken();  // eat ')'.

  // Verify right number of names for operator.
  if (Kind &amp;&amp; ArgNames.size() != Kind)
    return ErrorP("Invalid number of operands for operator");

  return new PrototypeAST(FnName, ArgNames, Kind != 0, BinaryPrecedence);
}

/// definition ::= 'def' prototype expression
static FunctionAST *ParseDefinition() {
  getNextToken();  // eat def.
  PrototypeAST *Proto = ParsePrototype();
  if (Proto == 0) return 0;

  if (ExprAST *E = ParseExpression())
    return new FunctionAST(Proto, E);
  return 0;
}

/// toplevelexpr ::= expression
static FunctionAST *ParseTopLevelExpr() {
  if (ExprAST *E = ParseExpression()) {
    // Make an anonymous proto.
    PrototypeAST *Proto = new PrototypeAST("", std::vector&lt;std::string&gt;());
    return new FunctionAST(Proto, E);
  }
  return 0;
}

/// external ::= 'extern' prototype
static PrototypeAST *ParseExtern() {
  getNextToken();  // eat extern.
  return ParsePrototype();
}

//===----------------------------------------------------------------------===//
// Code Generation
//===----------------------------------------------------------------------===//

static Module *TheModule;
static LLVMFoldingBuilder Builder;
static std::map&lt;std::string, AllocaInst*&gt; NamedValues;
static FunctionPassManager *TheFPM;

Value *ErrorV(const char *Str) { Error(Str); return 0; }

/// CreateEntryBlockAlloca - Create an alloca instruction in the entry block of
/// the function.  This is used for mutable variables etc.
static AllocaInst *CreateEntryBlockAlloca(Function *TheFunction,
                                          const std::string &amp;VarName) {
  LLVMBuilder TmpB(&amp;TheFunction-&gt;getEntryBlock(),
                   TheFunction-&gt;getEntryBlock().begin());
  return TmpB.CreateAlloca(Type::DoubleTy, 0, VarName.c_str());
}


Value *NumberExprAST::Codegen() {
  return ConstantFP::get(Type::DoubleTy, APFloat(Val));
}

Value *VariableExprAST::Codegen() {
  // Look this variable up in the function.
  Value *V = NamedValues[Name];
  if (V == 0) return ErrorV("Unknown variable name");

  // Load the value.
  return Builder.CreateLoad(V, Name.c_str());
}

Value *UnaryExprAST::Codegen() {
  Value *OperandV = Operand-&gt;Codegen();
  if (OperandV == 0) return 0;

  Function *F = TheModule-&gt;getFunction(std::string("unary")+Opcode);
  if (F == 0)
    return ErrorV("Unknown unary operator");

  return Builder.CreateCall(F, OperandV, "unop");
}


Value *BinaryExprAST::Codegen() {
  // Special case '=' because we don't want to emit the LHS as an expression.
  if (Op == '=') {
    // Assignment requires the LHS to be an identifier.
    VariableExprAST *LHSE = dynamic_cast&lt;VariableExprAST*&gt;(LHS);
    if (!LHSE)
      return ErrorV("destination of '=' must be a variable");
    // Codegen the RHS.
    Value *Val = RHS-&gt;Codegen();
    if (Val == 0) return 0;

    // Look up the name.
    Value *Variable = NamedValues[LHSE-&gt;getName()];
    if (Variable == 0) return ErrorV("Unknown variable name");

    Builder.CreateStore(Val, Variable);
    return Val;
  }


  Value *L = LHS-&gt;Codegen();
  Value *R = RHS-&gt;Codegen();
  if (L == 0 || R == 0) return 0;

  switch (Op) {
  case '+': return Builder.CreateAdd(L, R, "addtmp");
  case '-': return Builder.CreateSub(L, R, "subtmp");
  case '*': return Builder.CreateMul(L, R, "multmp");
  case '&lt;':
    L = Builder.CreateFCmpULT(L, R, "cmptmp");
    // Convert bool 0/1 to double 0.0 or 1.0
    return Builder.CreateUIToFP(L, Type::DoubleTy, "booltmp");
  default: break;
  }

  // If it wasn't a builtin binary operator, it must be a user defined one. Emit
  // a call to it.
  Function *F = TheModule-&gt;getFunction(std::string("binary")+Op);
  assert(F &amp;&amp; "binary operator not found!");

  Value *Ops[] = { L, R };
  return Builder.CreateCall(F, Ops, Ops+2, "binop");
}

Value *CallExprAST::Codegen() {
  // Look up the name in the global module table.
  Function *CalleeF = TheModule-&gt;getFunction(Callee);
  if (CalleeF == 0)
    return ErrorV("Unknown function referenced");

  // If argument mismatch error.
  if (CalleeF-&gt;arg_size() != Args.size())
    return ErrorV("Incorrect # arguments passed");

  std::vector&lt;Value*&gt; ArgsV;
  for (unsigned i = 0, e = Args.size(); i != e; ++i) {
    ArgsV.push_back(Args[i]-&gt;Codegen());
    if (ArgsV.back() == 0) return 0;
  }

  return Builder.CreateCall(CalleeF, ArgsV.begin(), ArgsV.end(), "calltmp");
}

Value *IfExprAST::Codegen() {
  Value *CondV = Cond-&gt;Codegen();
  if (CondV == 0) return 0;

  // Convert condition to a bool by comparing non-equal to 0.0.
  CondV = Builder.CreateFCmpONE(CondV,
                                ConstantFP::get(Type::DoubleTy, APFloat(0.0)),
                                "ifcond");

  Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();

  // Create blocks for the then and else cases.  Insert the 'then' block at the
  // end of the function.
  BasicBlock *ThenBB = new BasicBlock("then", TheFunction);
  BasicBlock *ElseBB = new BasicBlock("else");
  BasicBlock *MergeBB = new BasicBlock("ifcont");

  Builder.CreateCondBr(CondV, ThenBB, ElseBB);

  // Emit then value.
  Builder.SetInsertPoint(ThenBB);

  Value *ThenV = Then-&gt;Codegen();
  if (ThenV == 0) return 0;

  Builder.CreateBr(MergeBB);
  // Codegen of 'Then' can change the current block, update ThenBB for the PHI.
  ThenBB = Builder.GetInsertBlock();

  // Emit else block.
  TheFunction-&gt;getBasicBlockList().push_back(ElseBB);
  Builder.SetInsertPoint(ElseBB);

  Value *ElseV = Else-&gt;Codegen();
  if (ElseV == 0) return 0;

  Builder.CreateBr(MergeBB);
  // Codegen of 'Else' can change the current block, update ElseBB for the PHI.
  ElseBB = Builder.GetInsertBlock();

  // Emit merge block.
  TheFunction-&gt;getBasicBlockList().push_back(MergeBB);
  Builder.SetInsertPoint(MergeBB);
  PHINode *PN = Builder.CreatePHI(Type::DoubleTy, "iftmp");

  PN-&gt;addIncoming(ThenV, ThenBB);
  PN-&gt;addIncoming(ElseV, ElseBB);
  return PN;
}

1758Value *ForExprAST::Codegen() {
1759 // Output this as:
1760 // var = alloca double
1761 // ...
1762 // start = startexpr
1763 // store start -&gt; var
1764 // goto loop
1765 // loop:
1766 // ...
1767 // bodyexpr
1768 // ...
1769 // loopend:
1770 // step = stepexpr
1771 // endcond = endexpr
1772 //
1773 // curvar = load var
1774 // nextvar = curvar + step
1775 // store nextvar -&gt; var
1776 // br endcond, loop, endloop
1777 // outloop:
1778
1779 Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();
1780
1781 // Create an alloca for the variable in the entry block.
1782 AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);
1783
1784 // Emit the start code first, without 'variable' in scope.
1785 Value *StartVal = Start-&gt;Codegen();
1786 if (StartVal == 0) return 0;
1787
  // Store the value into the alloca.
  Builder.CreateStore(StartVal, Alloca);

  // Make the new basic block for the loop header, inserting after current
  // block.
  BasicBlock *PreheaderBB = Builder.GetInsertBlock();
  BasicBlock *LoopBB = new BasicBlock("loop", TheFunction);

  // Insert an explicit fall through from the current block to the LoopBB.
  Builder.CreateBr(LoopBB);

  // Start insertion in LoopBB.
  Builder.SetInsertPoint(LoopBB);

  // Within the loop, the variable is bound to the alloca.  If it shadows an
  // existing variable, we have to restore it, so save it now.
  AllocaInst *OldVal = NamedValues[VarName];
  NamedValues[VarName] = Alloca;

  // Emit the body of the loop.  This, like any other expr, can change the
  // current BB.  Note that we ignore the value computed by the body, but don't
  // allow an error.
  if (Body-&gt;Codegen() == 0)
    return 0;

  // Emit the step value.
  Value *StepVal;
  if (Step) {
    StepVal = Step-&gt;Codegen();
    if (StepVal == 0) return 0;
  } else {
    // If not specified, use 1.0.
    StepVal = ConstantFP::get(Type::DoubleTy, APFloat(1.0));
  }

  // Compute the end condition.
  Value *EndCond = End-&gt;Codegen();
  if (EndCond == 0) return 0;

  // Reload, increment, and store back to the alloca.  This handles the case
  // where the body of the loop mutates the variable.
  Value *CurVar = Builder.CreateLoad(Alloca, VarName.c_str());
  Value *NextVar = Builder.CreateAdd(CurVar, StepVal, "nextvar");
  Builder.CreateStore(NextVar, Alloca);

  // Convert condition to a bool by comparing not-equal to 0.0.
  EndCond = Builder.CreateFCmpONE(EndCond,
                                  ConstantFP::get(Type::DoubleTy, APFloat(0.0)),
                                  "loopcond");

  // Create the "after loop" block and insert it.
  BasicBlock *LoopEndBB = Builder.GetInsertBlock();
  BasicBlock *AfterBB = new BasicBlock("afterloop", TheFunction);

  // Insert the conditional branch into the end of LoopEndBB.
  Builder.CreateCondBr(EndCond, LoopBB, AfterBB);

  // Any new code will be inserted in AfterBB.
  Builder.SetInsertPoint(AfterBB);

  // Restore the unshadowed variable.
  if (OldVal)
    NamedValues[VarName] = OldVal;
  else
    NamedValues.erase(VarName);


  // for expr always returns 0.0.
  return Constant::getNullValue(Type::DoubleTy);
}

Value *VarExprAST::Codegen() {
  std::vector&lt;AllocaInst *&gt; OldBindings;

  Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();

  // Register all variables and emit their initializer.
  for (unsigned i = 0, e = VarNames.size(); i != e; ++i) {
    const std::string &amp;VarName = VarNames[i].first;
    ExprAST *Init = VarNames[i].second;

    // Emit the initializer before adding the variable to scope.  This prevents
    // the initializer from referencing the variable itself, and permits stuff
    // like this:
    //  var a = 1 in
    //    var a = a in ...   # refers to outer 'a'.
    Value *InitVal;
    if (Init) {
      InitVal = Init-&gt;Codegen();
      if (InitVal == 0) return 0;
    } else { // If not specified, use 0.0.
      InitVal = ConstantFP::get(Type::DoubleTy, APFloat(0.0));
    }

    AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);
    Builder.CreateStore(InitVal, Alloca);

    // Remember the old variable binding so that we can restore the binding when
    // we unrecurse.
    OldBindings.push_back(NamedValues[VarName]);

    // Remember this binding.
    NamedValues[VarName] = Alloca;
  }

  // Codegen the body, now that all vars are in scope.
  Value *BodyVal = Body-&gt;Codegen();
  if (BodyVal == 0) return 0;

  // Pop all our variables from scope.
  for (unsigned i = 0, e = VarNames.size(); i != e; ++i)
    NamedValues[VarNames[i].first] = OldBindings[i];

  // Return the body computation.
  return BodyVal;
}


Function *PrototypeAST::Codegen() {
  // Make the function type:  double(double,double) etc.
  std::vector&lt;const Type*&gt; Doubles(Args.size(), Type::DoubleTy);
  FunctionType *FT = FunctionType::get(Type::DoubleTy, Doubles, false);

  Function *F = new Function(FT, Function::ExternalLinkage, Name, TheModule);

  // If F conflicted, there was already something named 'Name'.  If it has a
  // body, don't allow redefinition or reextern.
  if (F-&gt;getName() != Name) {
    // Delete the one we just made and get the existing one.
    F-&gt;eraseFromParent();
    F = TheModule-&gt;getFunction(Name);

    // If F already has a body, reject this.
    if (!F-&gt;empty()) {
      ErrorF("redefinition of function");
      return 0;
    }

    // If F took a different number of args, reject.
    if (F-&gt;arg_size() != Args.size()) {
      ErrorF("redefinition of function with different # args");
      return 0;
    }
  }

  // Set names for all arguments.
  unsigned Idx = 0;
  for (Function::arg_iterator AI = F-&gt;arg_begin(); Idx != Args.size();
       ++AI, ++Idx)
    AI-&gt;setName(Args[Idx]);

  return F;
}

/// CreateArgumentAllocas - Create an alloca for each argument and register the
/// argument in the symbol table so that references to it will succeed.
void PrototypeAST::CreateArgumentAllocas(Function *F) {
  Function::arg_iterator AI = F-&gt;arg_begin();
  for (unsigned Idx = 0, e = Args.size(); Idx != e; ++Idx, ++AI) {
    // Create an alloca for this variable.
    AllocaInst *Alloca = CreateEntryBlockAlloca(F, Args[Idx]);

    // Store the initial value into the alloca.
    Builder.CreateStore(AI, Alloca);

    // Add arguments to variable symbol table.
    NamedValues[Args[Idx]] = Alloca;
  }
}


Function *FunctionAST::Codegen() {
  NamedValues.clear();

  Function *TheFunction = Proto-&gt;Codegen();
  if (TheFunction == 0)
    return 0;

  // If this is an operator, install it.
  if (Proto-&gt;isBinaryOp())
    BinopPrecedence[Proto-&gt;getOperatorName()] = Proto-&gt;getBinaryPrecedence();

  // Create a new basic block to start insertion into.
  BasicBlock *BB = new BasicBlock("entry", TheFunction);
  Builder.SetInsertPoint(BB);

  // Add all arguments to the symbol table and create their allocas.
  Proto-&gt;CreateArgumentAllocas(TheFunction);

  if (Value *RetVal = Body-&gt;Codegen()) {
    // Finish off the function.
    Builder.CreateRet(RetVal);

    // Validate the generated code, checking for consistency.
    verifyFunction(*TheFunction);

    // Optimize the function.
    TheFPM-&gt;run(*TheFunction);

    return TheFunction;
  }

  // Error reading body, remove function.
  TheFunction-&gt;eraseFromParent();

  if (Proto-&gt;isBinaryOp())
    BinopPrecedence.erase(Proto-&gt;getOperatorName());
  return 0;
}

//===----------------------------------------------------------------------===//
// Top-Level parsing and JIT Driver
//===----------------------------------------------------------------------===//

static ExecutionEngine *TheExecutionEngine;

static void HandleDefinition() {
  if (FunctionAST *F = ParseDefinition()) {
    if (Function *LF = F-&gt;Codegen()) {
      fprintf(stderr, "Read function definition:");
      LF-&gt;dump();
    }
  } else {
    // Skip token for error recovery.
    getNextToken();
  }
}

static void HandleExtern() {
  if (PrototypeAST *P = ParseExtern()) {
    if (Function *F = P-&gt;Codegen()) {
      fprintf(stderr, "Read extern: ");
      F-&gt;dump();
    }
  } else {
    // Skip token for error recovery.
    getNextToken();
  }
}

static void HandleTopLevelExpression() {
  // Evaluate a top level expression into an anonymous function.
  if (FunctionAST *F = ParseTopLevelExpr()) {
    if (Function *LF = F-&gt;Codegen()) {
      // JIT the function, returning a function pointer.
      void *FPtr = TheExecutionEngine-&gt;getPointerToFunction(LF);

      // Cast it to the right type (takes no arguments, returns a double) so we
      // can call it as a native function.
      double (*FP)() = (double (*)())FPtr;
      fprintf(stderr, "Evaluated to %f\n", FP());
    }
  } else {
    // Skip token for error recovery.
    getNextToken();
  }
}

/// top ::= definition | external | expression | ';'
static void MainLoop() {
  while (1) {
    fprintf(stderr, "ready&gt; ");
    switch (CurTok) {
    case tok_eof:    return;
    case ';':        getNextToken(); break;  // ignore top level semicolons.
    case tok_def:    HandleDefinition(); break;
    case tok_extern: HandleExtern(); break;
    default:         HandleTopLevelExpression(); break;
    }
  }
}



//===----------------------------------------------------------------------===//
// "Library" functions that can be "extern'd" from user code.
//===----------------------------------------------------------------------===//

/// putchard - putchar that takes a double and returns 0.
extern "C"
double putchard(double X) {
  putchar((char)X);
  return 0;
}

/// printd - printf that takes a double and prints it as "%f\n", returning 0.
extern "C"
double printd(double X) {
  printf("%f\n", X);
  return 0;
}

//===----------------------------------------------------------------------===//
// Main driver code.
//===----------------------------------------------------------------------===//

int main() {
  // Install standard binary operators.
  // 1 is lowest precedence.
  BinopPrecedence['='] = 2;
  BinopPrecedence['&lt;'] = 10;
  BinopPrecedence['+'] = 20;
  BinopPrecedence['-'] = 20;
  BinopPrecedence['*'] = 40;  // highest.

  // Prime the first token.
  fprintf(stderr, "ready&gt; ");
  getNextToken();

  // Make the module, which holds all the code.
  TheModule = new Module("my cool jit");

  // Create the JIT.
  TheExecutionEngine = ExecutionEngine::create(TheModule);

  {
    ExistingModuleProvider OurModuleProvider(TheModule);
    FunctionPassManager OurFPM(&amp;OurModuleProvider);

    // Set up the optimizer pipeline.  Start with registering info about how the
    // target lays out data structures.
    OurFPM.add(new TargetData(*TheExecutionEngine-&gt;getTargetData()));
    // Promote allocas to registers.
    OurFPM.add(createPromoteMemoryToRegisterPass());
    // Do simple "peephole" optimizations and bit-twiddling optzns.
    OurFPM.add(createInstructionCombiningPass());
    // Reassociate expressions.
    OurFPM.add(createReassociatePass());
    // Eliminate Common SubExpressions.
    OurFPM.add(createGVNPass());
    // Simplify the control flow graph (deleting unreachable blocks, etc).
    OurFPM.add(createCFGSimplificationPass());

    // Set the global so the code gen can use this.
    TheFPM = &amp;OurFPM;

    // Run the main "interpreter loop" now.
    MainLoop();

    TheFPM = 0;
  }  // Free module provider and pass manager.


  // Print out all of the generated code.
  TheModule-&gt;dump();
  return 0;
}
</pre>
</div>
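
<p>The save-and-restore handling of <tt>NamedValues</tt> in
<tt>ForExprAST::Codegen</tt> and <tt>VarExprAST::Codegen</tt> can be tried
without LLVM. The following is a minimal sketch under stated assumptions, not
part of the tutorial's code: <tt>Scope</tt> and <tt>WithBindings</tt> are
hypothetical names, and plain integer slot ids stand in for
<tt>AllocaInst*</tt> values. Unlike the listing, which restores bindings in
forward order, the sketch restores them in reverse so that a name listed twice
in one scope still ends up with its outer binding.</p>

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the tutorial's NamedValues table: maps a variable
// name to an integer slot id instead of an AllocaInst*.  0 means "unbound".
static std::map<std::string, int> Scope;

// Binds each (name, slot) pair in order, runs Body with those bindings
// visible, then puts the previous bindings back and returns Body's result.
template <typename Fn>
int WithBindings(const std::vector<std::pair<std::string, int> > &Vars,
                 Fn Body) {
  std::vector<int> OldBindings;

  for (std::size_t i = 0, e = Vars.size(); i != e; ++i) {
    // Remember whatever the name was bound to before (0 if nothing).
    OldBindings.push_back(Scope[Vars[i].first]);
    Scope[Vars[i].first] = Vars[i].second;
  }

  int Result = Body();

  // Undo the bindings in reverse order.  A previously unbound name is erased
  // rather than left behind with a null slot.
  for (std::size_t i = Vars.size(); i-- > 0;) {
    if (OldBindings[i])
      Scope[Vars[i].first] = OldBindings[i];
    else
      Scope.erase(Vars[i].first);
  }
  return Result;
}
```

<p>For example, with <tt>Scope["a"]</tt> set to 1, a call that binds
<tt>a</tt> to 2 and <tt>b</tt> to 3 lets the body see those two slots; once
the call returns, <tt>a</tt> maps to 1 again and <tt>b</tt> is no longer in
the table.</p>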

</div>

<!-- *********************************************************************** -->
<hr>
<address>
  <a href="http://jigsaw.w3.org/css-validator/check/referer"><img
  src="http://jigsaw.w3.org/css-validator/images/vcss" alt="Valid CSS!"></a>
  <a href="http://validator.w3.org/check/referer"><img
  src="http://www.w3.org/Icons/valid-html401" alt="Valid HTML 4.01!"></a>

  <a href="mailto:sabre@nondot.org">Chris Lattner</a><br>
  <a href="http://llvm.org">The LLVM Compiler Infrastructure</a><br>
  Last modified: $Date: 2007-10-17 11:05:13 -0700 (Wed, 17 Oct 2007) $
</address>
</body>
</html>