blob: a1146ed263bbba18ad1eb1ebf79055ec13683dc4 [file] [log] [blame]
Chris Lattner602c832c2007-10-31 06:30:21 +00001<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
2 "http://www.w3.org/TR/html4/strict.dtd">
3
4<html>
5<head>
6 <title>Kaleidoscope: Extending the Language: Control Flow</title>
7 <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
8 <meta name="author" content="Chris Lattner">
9 <link rel="stylesheet" href="../llvm.css" type="text/css">
10</head>
11
12<body>
13
14<div class="doc_title">Kaleidoscope: Extending the Language: Control Flow</div>
15
16<div class="doc_author">
17 <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a></p>
18</div>
19
20<!-- *********************************************************************** -->
21<div class="doc_section"><a name="intro">Part 5 Introduction</a></div>
22<!-- *********************************************************************** -->
23
24<div class="doc_text">
25
26<p>Welcome to Part 5 of the "<a href="index.html">Implementing a language with
27LLVM</a>" tutorial. Parts 1-4 described the implementation of the simple
28Kaleidoscope language and included support for generating LLVM IR, following by
29optimizations and a JIT compiler. Unfortunately, as presented, Kaleidoscope is
30mostly useless: it has no control flow other than call and return. This means
31that you can't have conditional branches in the code, significantly limiting its
32power. In this episode of "build that compiler", we'll extend Kaleidoscope to
33have an if/then/else expression plus a simple looping construct.</p>
34
35</div>
36
37<!-- *********************************************************************** -->
38<div class="doc_section"><a name="ifthen">If/Then/Else</a></div>
39<!-- *********************************************************************** -->
40
41<div class="doc_text">
42
43<p>
44Extending Kaleidoscope to support if/then/else is quite straight-forward. It
45basically requires adding lexer support for this "new" concept to the lexer,
46parser, AST, and LLVM code emitter. This example is nice, because it shows how
47easy it is to "grow" a language over time, incrementally extending it as new
48ideas are discovered.</p>
49
50<p>Before we get going on "how" we do this extension, lets talk about what we
51want. The basic idea is that we want to be able to write this sort of thing:
52</p>
53
54<div class="doc_code">
55<pre>
56def fib(x)
57 if x &lt; 3 then
58 1
59 else
60 fib(x-1)+fib(x-2);
61</pre>
62</div>
63
64<p>In Kaleidoscope, every construct is an expression: there are no statements.
65As such, the if/then/else expression needs to return a value like any other.
66Since we're using a mostly functional form, we'll have it evaluate its
67conditional, then return the 'then' or 'else' value based on how the condition
68was resolved. This is very similar to the C "?:" expression.</p>
69
70<p>The semantics of the if/then/else expression is that it first evaluates the
71condition to a boolean equality value: 0.0 is false and everything else is true.
72If the condition is true, the first subexpression is evaluated and returned, if
73the condition is false, the second subexpression is evaluated and returned.
74Since Kaleidoscope allows side-effects, this behavior is important to nail down.
75</p>
76
77<p>Now that we know what we want, lets break this down into its constituent
78pieces.</p>
79
80</div>
81
82<!-- ======================================================================= -->
83<div class="doc_subsubsection"><a name="iflexer">Lexer Extensions for
84If/Then/Else</a></div>
85<!-- ======================================================================= -->
86
87
88<div class="doc_text">
89
90<p>The lexer extensions are straight-forward. First we add new enum values
91for the relevant tokens:</p>
92
93<div class="doc_code">
94<pre>
95 // control
96 tok_if = -6, tok_then = -7, tok_else = -8,
97</pre>
98</div>
99
100<p>Once we have that, we recognize the new keywords in the lexer, pretty simple
101stuff:</p>
102
103<div class="doc_code">
104<pre>
105 ...
106 if (IdentifierStr == "def") return tok_def;
107 if (IdentifierStr == "extern") return tok_extern;
108 <b>if (IdentifierStr == "if") return tok_if;
109 if (IdentifierStr == "then") return tok_then;
110 if (IdentifierStr == "else") return tok_else;</b>
111 return tok_identifier;
112</pre>
113</div>
114
115</div>
116
117<!-- ======================================================================= -->
118<div class="doc_subsubsection"><a name="ifast">AST Extensions for
119 If/Then/Else </a></div>
120<!-- ======================================================================= -->
121
122<div class="doc_text">
123
124<p>To represent the new expression we add a new AST node for it:</p>
125
126<div class="doc_code">
127<pre>
128/// IfExprAST - Expression class for if/then/else.
129class IfExprAST : public ExprAST {
130 ExprAST *Cond, *Then, *Else;
131public:
132 IfExprAST(ExprAST *cond, ExprAST *then, ExprAST *_else)
133 : Cond(cond), Then(then), Else(_else) {}
134 virtual Value *Codegen();
135};
136</pre>
137</div>
138
139<p>The AST node just has pointers to the various subexpressions.</p>
140
141</div>
142
143<!-- ======================================================================= -->
144<div class="doc_subsubsection"><a name="ifparser">Parser Extensions for
145If/Then/Else </a></div>
146<!-- ======================================================================= -->
147
148<div class="doc_text">
149
150<p>Now that we have the relevant tokens coming from the lexer and we have the
151AST node to build, our parsing logic is relatively straight-forward. First we
152define a new parsing function:</p>
153
154<div class="doc_code">
155<pre>
156/// ifexpr ::= 'if' expression 'then' expression 'else' expression
157static ExprAST *ParseIfExpr() {
158 getNextToken(); // eat the if.
159
160 // condition.
161 ExprAST *Cond = ParseExpression();
162 if (!Cond) return 0;
163
164 if (CurTok != tok_then)
165 return Error("expected then");
166 getNextToken(); // eat the then
167
168 ExprAST *Then = ParseExpression();
169 if (Then == 0) return 0;
170
171 if (CurTok != tok_else)
172 return Error("expected else");
173
174 getNextToken();
175
176 ExprAST *Else = ParseExpression();
177 if (!Else) return 0;
178
179 return new IfExprAST(Cond, Then, Else);
180}
181</pre>
182</div>
183
184<p>Next we hook it up as a primary expression:</p>
185
186<div class="doc_code">
187<pre>
188static ExprAST *ParsePrimary() {
189 switch (CurTok) {
190 default: return Error("unknown token when expecting an expression");
191 case tok_identifier: return ParseIdentifierExpr();
192 case tok_number: return ParseNumberExpr();
193 case '(': return ParseParenExpr();
194 <b>case tok_if: return ParseIfExpr();</b>
195 }
196}
197</pre>
198</div>
199
200</div>
201
202<!-- ======================================================================= -->
203<div class="doc_subsubsection"><a name="ifir">LLVM IR for If/Then/Else</a></div>
204<!-- ======================================================================= -->
205
206<div class="doc_text">
207
208<p>Now that we have it parsing and building the AST, the final piece is adding
209LLVM code generation support. This is the most interesting part of the
210if/then/else example, because this is where it starts to introduce new concepts.
211All of the code above has been described in previous chapters fairly thoroughly.
212</p>
213
214<p>To motivate the code we want to produce, lets take a look at a simple
215example. Consider:</p>
216
217<div class="doc_code">
218<pre>
219extern foo();
220extern bar();
221def baz(x) if x then foo() else bar();
222</pre>
223</div>
224
225<p>If you disable optimizations, the code you'll (soon) get from Kaleidoscope
226looks like this:</p>
227
228<div class="doc_code">
229<pre>
230declare double @foo()
231
232declare double @bar()
233
234define double @baz(double %x) {
235entry:
236 %ifcond = fcmp one double %x, 0.000000e+00
237 br i1 %ifcond, label %then, label %else
238
239then: ; preds = %entry
240 %calltmp = call double @foo()
241 br label %ifcont
242
243else: ; preds = %entry
244 %calltmp1 = call double @bar()
245 br label %ifcont
246
247ifcont: ; preds = %else, %then
248 %iftmp = phi double [ %calltmp, %then ], [ %calltmp1, %else ]
249 ret double %iftmp
250}
251</pre>
252</div>
253
254<p>To visualize the control flow graph, you can use a nifty feature of the LLVM
255'<a href="http://llvm.org/cmds/opt.html">opt</a>' tool. If you put this LLVM IR
256into "t.ll" and run "<tt>llvm-as &lt; t.ll | opt -analyze -view-cfg</tt>", <a
257href="../ProgrammersManual.html#ViewGraph">a window will pop up</a> and you'll
258see this graph:</p>
259
260<center><img src="LangImpl5-cfg.png" alt="Example CFG" width="423"
261height="315"></center>
262
263<p>Another way to get this is to call "<tt>F-&gt;viewCFG()</tt>" or
264"<tt>F-&gt;viewCFGOnly()</tt>" (where F is a "<tt>Function*</tt>") either by
265inserting actual calls into the code and recompiling or by calling these in the
266debugger. LLVM has many nice features for visualizing various graphs.</p>
267
268<p>Coming back to the generated code, it is fairly simple: the entry block
269evaluates the conditional expression ("x" in our case here) and compares the
270result to 0.0 with the "<tt><a href="../LangRef.html#i_fcmp">fcmp</a> one</tt>"
271instruction ('one' is "ordered and not equal"). Based on the result of this
272expression, the code jumps to either the "then" or "else" blocks, which contain
273the expressions for the true/false case.</p>
274
275<p>Once the then/else blocks is finished executing, they both branch back to the
276else block to execute the code that happens after the if/then/else. In this
277case the only thing left to do is to return to the caller of the function. The
278question then becomes: how does the code know which expression to return?</p>
279
280<p>The answer to this question involves an important SSA operation: the
281<a href="http://en.wikipedia.org/wiki/Static_single_assignment_form">Phi
282operation</a>. If you're not familiar with SSA, <a
283href="http://en.wikipedia.org/wiki/Static_single_assignment_form">the wikipedia
284article</a> is a good introduction and there are various other introductions to
285it available on your favorite search engine. The short version is that
286"execution" of the Phi operation requires "remembering" which block control came
287from. The Phi operation takes on the value corresponding to the input control
288block. In this case, if control comes in from the "then" block, it gets the
289value of "calltmp". If control comes from the "else" block, it gets the value
290of "calltmp1".</p>
291
292<p>At this point, you are probably starting to think "on no! this means my
293simple and elegant front-end will have to start generating SSA form in order to
294use LLVM!". Fortunately, this is not the case, and we strongly advise
295<em>not</em> implementing an SSA construction algorithm in your front-end
296unless there is an amazingly good reason to do so. In practice, there are two
297sorts of values that float around in code written in your average imperative
298programming language that might need Phi nodes:</p>
299
300<ol>
301<li>Code that involves user variables: <tt>x = 1; x = x + 1; </tt></li>
302<li>Values that are implicit in the structure of your AST, such as the phi node
303in this case.</li>
304</ol>
305
Chris Lattnerb0f0deb2007-11-05 07:02:49 +0000306<p>In <a href="LangImpl7.html">Chapter 7</a> of this tutorial ("mutable
307variables"), we'll talk about #1
Chris Lattner602c832c2007-10-31 06:30:21 +0000308in depth. For now, just believe me that you don't need SSA construction to
309handle them. For #2, you have the choice of using the techniques that we will
310describe for #1, or you can insert Phi nodes directly if convenient. In this
311case, it is really really easy to generate the Phi node, so we choose to do it
312directly.</p>
313
314<p>Okay, enough of the motivation and overview, lets generate code!</p>
315
316</div>
317
318<!-- ======================================================================= -->
319<div class="doc_subsubsection"><a name="ifcodegen">Code Generation for
320If/Then/Else</a></div>
321<!-- ======================================================================= -->
322
323<div class="doc_text">
324
325<p>In order to generate code for this, we implement the <tt>Codegen</tt> method
326for <tt>IfExprAST</tt>:</p>
327
328<div class="doc_code">
329<pre>
330Value *IfExprAST::Codegen() {
331 Value *CondV = Cond-&gt;Codegen();
332 if (CondV == 0) return 0;
333
334 // Convert condition to a bool by comparing equal to 0.0.
335 CondV = Builder.CreateFCmpONE(CondV,
336 ConstantFP::get(Type::DoubleTy, APFloat(0.0)),
337 "ifcond");
338</pre>
339</div>
340
341<p>This code is straight-forward and similar to what we saw before. We emit the
342expression for the condition, then compare that value to zero to get a truth
343value as a 1-bit (bool) value.</p>
344
345<div class="doc_code">
346<pre>
347 Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();
348
349 // Create blocks for the then and else cases. Insert the 'then' block at the
350 // end of the function.
351 BasicBlock *ThenBB = new BasicBlock("then", TheFunction);
352 BasicBlock *ElseBB = new BasicBlock("else");
353 BasicBlock *MergeBB = new BasicBlock("ifcont");
354
355 Builder.CreateCondBr(CondV, ThenBB, ElseBB);
356</pre>
357</div>
358
359<p>This code creates the basic blocks that are related to the if/then/else
360statement, and correspond directly to the blocks in the example above. The
361first line of this gets the current Function object that is being built. It
362gets this by asking the builder for the current BasicBlock, and asking that
363block for its "parent" (the function it is currently embedded into).</p>
364
365<p>Once it has that, it creates three blocks. Note that it passes "TheFunction"
366into the constructor for the "then" block. This causes the constructor to
367automatically insert the new block onto the end of the specified function. The
368other two blocks are created, but aren't yet inserted into the function.</p>
369
370<p>Once the blocks are created, we can emit the conditional branch that chooses
371between them. Note that creating new blocks does not implicitly affect the
372LLVMBuilder, so it is still inserting into the block that the condition
373went into. Also note that it is creating a branch to the "then" block and the
374"else" block, even though the "else" block isn't inserted into the function yet.
375This is all ok: it is the standard way that LLVM supports forward
376references.</p>
377
378<div class="doc_code">
379<pre>
380 // Emit then value.
381 Builder.SetInsertPoint(ThenBB);
382
383 Value *ThenV = Then-&gt;Codegen();
384 if (ThenV == 0) return 0;
385
386 Builder.CreateBr(MergeBB);
387 // Codegen of 'Then' can change the current block, update ThenBB for the PHI.
388 ThenBB = Builder.GetInsertBlock();
389</pre>
390</div>
391
392<p>After the conditional branch is inserted, we move the builder to start
393inserting into the "then" block. Strictly speaking, this call moves the
394insertion point to be at the end of the specified block. However, since the
395"then" block is empty, it also starts out by inserting at the beginning of the
396block. :)</p>
397
398<p>Once the insertion point is set, we recursively codegen the "then" expression
399from the AST. To finish off the then block, we create an unconditional branch
400to the merge block. One interesting (and very important) aspect of the LLVM IR
401is that it <a href="../LangRef.html#functionstructure">requires all basic blocks
402to be "terminated"</a> with a <a href="../LangRef.html#terminators">control flow
403instruction</a> such as return or branch. This means that all control flow,
404<em>including fall throughs</em> must be made explicit in the LLVM IR. If you
405violate this rule, the verifier will emit an error.</p>
406
407<p>The final line here is quite subtle, but is very important. The basic issue
408is that when we create the Phi node in the merge block, we need to set up the
409block/value pairs that indicate how the Phi will work. Importantly, the Phi
Chris Lattnerb5019642007-11-05 17:52:04 +0000410node expects to have an entry for each predecessor of the block in the CFG. Why
Chris Lattner602c832c2007-10-31 06:30:21 +0000411then are we getting the current block when we just set it to ThenBB 5 lines
412above? The problem is that the "Then" expression may actually itself change the
413block that the Builder is emitting into if, for example, it contains a nested
414"if/then/else" expression. Because calling Codegen recursively could
415arbitrarily change the notion of the current block, we are required to get an
416up-to-date value for code that will set up the Phi node.</p>
417
418<div class="doc_code">
419<pre>
420 // Emit else block.
421 TheFunction-&gt;getBasicBlockList().push_back(ElseBB);
422 Builder.SetInsertPoint(ElseBB);
423
424 Value *ElseV = Else-&gt;Codegen();
425 if (ElseV == 0) return 0;
426
427 Builder.CreateBr(MergeBB);
428 // Codegen of 'Else' can change the current block, update ElseBB for the PHI.
429 ElseBB = Builder.GetInsertBlock();
430</pre>
431</div>
432
433<p>Code generation for the 'else' block is basically identical to codegen for
434the 'then' block. The only significant difference is the first line, which adds
435the 'else' block to the function. Recall previously that the 'else' block was
436created, but not added to the function. Now that the 'then' and 'else' blocks
437are emitted, we can finish up with the merge code:</p>
438
439<div class="doc_code">
440<pre>
441 // Emit merge block.
442 TheFunction->getBasicBlockList().push_back(MergeBB);
443 Builder.SetInsertPoint(MergeBB);
444 PHINode *PN = Builder.CreatePHI(Type::DoubleTy, "iftmp");
445
446 PN->addIncoming(ThenV, ThenBB);
447 PN->addIncoming(ElseV, ElseBB);
448 return PN;
449}
450</pre>
451</div>
452
453<p>The first two lines here are now familiar: the first adds the "merge" block
454to the Function object (it was previously floating, like the else block above).
455The second block changes the insertion point so that newly created code will go
456into the "merge" block. Once that is done, we need to create the PHI node and
457set up the block/value pairs for the PHI.</p>
458
459<p>Finally, the CodeGen function returns the phi node as the value computed by
460the if/then/else expression. In our example above, this returned value will
461feed into the code for the top-level function, which will create the return
462instruction.</p>
463
464<p>Overall, we now have the ability to execution conditional code in
465Kaleidoscope. With this extension, Kaleidoscope is a fairly complete language
466that can calculate a wide variety of numeric functions. Next up we'll add
467another useful expression that is familiar from non-functional languages...</p>
468
469</div>
470
471<!-- *********************************************************************** -->
472<div class="doc_section"><a name="for">'for' Loop Expression</a></div>
473<!-- *********************************************************************** -->
474
475<div class="doc_text">
476
Chris Lattnerf5234802007-10-31 06:47:39 +0000477<p>Now that we know how to add basic control flow constructs to the language,
478we have the tools to add more powerful things. Lets add something more
479aggressive, a 'for' expression:</p>
480
481<div class="doc_code">
482<pre>
Chris Lattnerf5234802007-10-31 06:47:39 +0000483 extern putchard(char)
Chris Lattner6093bd52007-10-31 07:29:43 +0000484 def printstar(n)
485 for i = 1, i &lt; n, 1.0 in
486 putchard(42); # ascii 42 = '*'
487
488 # print 100 '*' characters
489 printstar(100);
Chris Lattnerf5234802007-10-31 06:47:39 +0000490</pre>
491</div>
492
Chris Lattner6093bd52007-10-31 07:29:43 +0000493<p>This expression defines a new variable ("i" in this case) which iterates from
494a starting value, while the condition ("i &lt; n" in this case) is true,
Chris Lattnerf5234802007-10-31 06:47:39 +0000495incrementing by an optional step value ("1.0" in this case). If the step value
496is omitted, it defaults to 1.0. While the loop is true, it executes its
497body expression. Because we don't have anything better to return, we'll just
498define the loop as always returning 0.0. In the future when we have mutable
499variables, it will get more useful.</p>
500
501<p>As before, lets talk about the changes that we need to Kaleidoscope to
502support this.</p>
503
504</div>
505
506<!-- ======================================================================= -->
507<div class="doc_subsubsection"><a name="forlexer">Lexer Extensions for
508the 'for' Loop</a></div>
509<!-- ======================================================================= -->
510
511<div class="doc_text">
512
513<p>The lexer extensions are the same sort of thing as for if/then/else:</p>
514
515<div class="doc_code">
516<pre>
517 ... in enum Token ...
518 // control
519 tok_if = -6, tok_then = -7, tok_else = -8,
520<b> tok_for = -9, tok_in = -10</b>
521
522 ... in gettok ...
523 if (IdentifierStr == "def") return tok_def;
524 if (IdentifierStr == "extern") return tok_extern;
525 if (IdentifierStr == "if") return tok_if;
526 if (IdentifierStr == "then") return tok_then;
527 if (IdentifierStr == "else") return tok_else;
528 <b>if (IdentifierStr == "for") return tok_for;
529 if (IdentifierStr == "in") return tok_in;</b>
530 return tok_identifier;
531</pre>
532</div>
533
534</div>
535
536<!-- ======================================================================= -->
537<div class="doc_subsubsection"><a name="forast">AST Extensions for
538the 'for' Loop</a></div>
539<!-- ======================================================================= -->
540
541<div class="doc_text">
542
543<p>The AST node is similarly simple. It basically boils down to capturing
544the variable name and the consituent expressions in the node.</p>
545
546<div class="doc_code">
547<pre>
548/// ForExprAST - Expression class for for/in.
549class ForExprAST : public ExprAST {
550 std::string VarName;
551 ExprAST *Start, *End, *Step, *Body;
552public:
553 ForExprAST(const std::string &amp;varname, ExprAST *start, ExprAST *end,
554 ExprAST *step, ExprAST *body)
555 : VarName(varname), Start(start), End(end), Step(step), Body(body) {}
556 virtual Value *Codegen();
557};
558</pre>
559</div>
560
561</div>
562
563<!-- ======================================================================= -->
564<div class="doc_subsubsection"><a name="forparser">Parser Extensions for
565the 'for' Loop</a></div>
566<!-- ======================================================================= -->
567
568<div class="doc_text">
569
570<p>The parser code is also fairly standard. The only interesting thing here is
571handling of the optional step value. The parser code handles it by checking to
572see if the second comma is present. If not, it sets the step value to null in
573the AST node:</p>
574
575<div class="doc_code">
576<pre>
577/// forexpr ::= 'for' identifer '=' expr ',' expr (',' expr)? 'in' expression
578static ExprAST *ParseForExpr() {
579 getNextToken(); // eat the for.
580
581 if (CurTok != tok_identifier)
582 return Error("expected identifier after for");
583
584 std::string IdName = IdentifierStr;
585 getNextToken(); // eat identifer.
586
587 if (CurTok != '=')
588 return Error("expected '=' after for");
589 getNextToken(); // eat '='.
590
591
592 ExprAST *Start = ParseExpression();
593 if (Start == 0) return 0;
594 if (CurTok != ',')
595 return Error("expected ',' after for start value");
596 getNextToken();
597
598 ExprAST *End = ParseExpression();
599 if (End == 0) return 0;
600
601 // The step value is optional.
602 ExprAST *Step = 0;
603 if (CurTok == ',') {
604 getNextToken();
605 Step = ParseExpression();
606 if (Step == 0) return 0;
607 }
608
609 if (CurTok != tok_in)
610 return Error("expected 'in' after for");
611 getNextToken(); // eat 'in'.
612
613 ExprAST *Body = ParseExpression();
614 if (Body == 0) return 0;
615
616 return new ForExprAST(IdName, Start, End, Step, Body);
617}
618</pre>
619</div>
620
621</div>
622
623<!-- ======================================================================= -->
624<div class="doc_subsubsection"><a name="forir">LLVM IR for
625the 'for' Loop</a></div>
626<!-- ======================================================================= -->
627
628<div class="doc_text">
629
630<p>Now we get to the good part: the LLVM IR we want to generate for this thing.
Chris Lattner6093bd52007-10-31 07:29:43 +0000631With the simple example above, we get this LLVM IR (note that this dump is
632generated with optimizations disabled):
Chris Lattnerf5234802007-10-31 06:47:39 +0000633</p>
634
Chris Lattner6093bd52007-10-31 07:29:43 +0000635<div class="doc_code">
636<pre>
637declare double @putchard(double)
Chris Lattnerf5234802007-10-31 06:47:39 +0000638
Chris Lattner6093bd52007-10-31 07:29:43 +0000639define double @printstar(double %n) {
640entry:
641 ; initial value = 1.0 (inlined into phi)
642 br label %loop
643
644loop: ; preds = %loop, %entry
645 %i = phi double [ 1.000000e+00, %entry ], [ %nextvar, %loop ]
646 ; body
647 %calltmp = call double @putchard( double 4.200000e+01 )
648 ; increment
649 %nextvar = add double %i, 1.000000e+00
650
651 ; termination test
652 %multmp = fcmp ult double %i, %n
653 %booltmp = uitofp i1 %multmp to double
654 %loopcond = fcmp one double %booltmp, 0.000000e+00
655 br i1 %loopcond, label %loop, label %afterloop
656
657afterloop: ; preds = %loop
658 ; loop always returns 0.0
659 ret double 0.000000e+00
660}
661</pre>
662</div>
663
664<p>This loop contains all the same constructs we saw before: a phi node, several
665expressions, and some basic blocks. Lets see how this fits together.</p>
Chris Lattnerf5234802007-10-31 06:47:39 +0000666
667</div>
668
669<!-- ======================================================================= -->
670<div class="doc_subsubsection"><a name="forcodegen">Code Generation for
671the 'for' Loop</a></div>
672<!-- ======================================================================= -->
673
674<div class="doc_text">
675
Chris Lattner6093bd52007-10-31 07:29:43 +0000676<p>The first part of codegen is very simple: we just output the start expression
677for the loop value:</p>
Chris Lattnerf5234802007-10-31 06:47:39 +0000678
679<div class="doc_code">
680<pre>
681Value *ForExprAST::Codegen() {
Chris Lattner6093bd52007-10-31 07:29:43 +0000682 // Emit the start code first, without 'variable' in scope.
683 Value *StartVal = Start-&gt;Codegen();
684 if (StartVal == 0) return 0;
685</pre>
686</div>
687
688<p>With this out of the way, the next step is to set up the LLVM basic block
689for the start of the loop body. In the case above, the whole loop body is one
690block, but remember that the body code itself could consist of multiple blocks
691(e.g. if it is a if/then/else expression).</p>
692
693<div class="doc_code">
694<pre>
695 // Make the new basic block for the loop header, inserting after current
696 // block.
697 Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();
698 BasicBlock *PreheaderBB = Builder.GetInsertBlock();
699 BasicBlock *LoopBB = new BasicBlock("loop", TheFunction);
700
701 // Insert an explicit fall through from the current block to the LoopBB.
702 Builder.CreateBr(LoopBB);
703</pre>
704</div>
705
706<p>This code is similar to what we saw for if/then/else. Because we will need
707it to create the Phi node, we remember the block that falls through into the
708loop. Once we have that, we create the actual block that starts the loop and
709create an unconditional branch for the fall-through between the two blocks.</p>
710
711<div class="doc_code">
712<pre>
713 // Start insertion in LoopBB.
714 Builder.SetInsertPoint(LoopBB);
715
716 // Start the PHI node with an entry for Start.
717 PHINode *Variable = Builder.CreatePHI(Type::DoubleTy, VarName.c_str());
718 Variable-&gt;addIncoming(StartVal, PreheaderBB);
719</pre>
720</div>
721
722<p>Now that the "preheader" for the loop is set up, we switch to emitting code
723for the loop body. To begin with, we move the insertion point and create the
724PHI node for the loop induction variable. SInce we already know the incoming
725value for the starting value, we add it to the Phi node. Note that the Phi will
726eventually get a second value for the backedge, but we can't set it up yet
727(because it doesn't exist!).</p>
728
729<div class="doc_code">
730<pre>
731 // Within the loop, the variable is defined equal to the PHI node. If it
732 // shadows an existing variable, we have to restore it, so save it now.
733 Value *OldVal = NamedValues[VarName];
734 NamedValues[VarName] = Variable;
735
736 // Emit the body of the loop. This, like any other expr, can change the
737 // current BB. Note that we ignore the value computed by the body, but don't
738 // allow an error.
739 if (Body-&gt;Codegen() == 0)
740 return 0;
741</pre>
742</div>
743
744<p>Now the code starts to get more interesting. Our 'for' loop introduces a new
745variable to the symbol table. This means that our symbol table can now contain
746either function arguments or loop variables. To handle this, before we codegen
747the body of the loop, we add the loop variable as the current value for its
748name. Note that it is possible that there is a variable of the same name in the
749outer scope. It would be easy to make this an error (emit an error and return
750null if there is already an entry for VarName) but we choose to allow shadowing
751of variables. In order to handle this correctly, we remember the Value that
752we are potentially shadowing in <tt>OldVal</tt> (which will be null if there is
753no shadowed variable).</p>
754
755<p>Once the loop variable is set into the symbol table, the code recursively
756codegen's the body. This allows the body to use the loop variable: any
757references to it will naturally find it in the symbol table.</p>
758
759<div class="doc_code">
760<pre>
761 // Emit the step value.
762 Value *StepVal;
763 if (Step) {
764 StepVal = Step-&gt;Codegen();
765 if (StepVal == 0) return 0;
766 } else {
767 // If not specified, use 1.0.
768 StepVal = ConstantFP::get(Type::DoubleTy, APFloat(1.0));
769 }
770
771 Value *NextVar = Builder.CreateAdd(Variable, StepVal, "nextvar");
772</pre>
773</div>
774
775<p>Now that the body is emitted, we compute the next value of the iteration
776variable by adding the step value or 1.0 if it isn't present. '<tt>NextVar</tt>'
777will be the value of the loop variable on the next iteration of the loop.</p>
778
779<div class="doc_code">
780<pre>
781 // Compute the end condition.
782 Value *EndCond = End-&gt;Codegen();
783 if (EndCond == 0) return EndCond;
784
785 // Convert condition to a bool by comparing equal to 0.0.
786 EndCond = Builder.CreateFCmpONE(EndCond,
787 ConstantFP::get(Type::DoubleTy, APFloat(0.0)),
788 "loopcond");
789</pre>
790</div>
791
792<p>Finally, we evaluate the exit value of the loop, to determine whether the
793loop should exit. This mirrors the condition evaluation for the if/then/else
794statement.</p>
795
796<div class="doc_code">
797<pre>
798 // Create the "after loop" block and insert it.
799 BasicBlock *LoopEndBB = Builder.GetInsertBlock();
800 BasicBlock *AfterBB = new BasicBlock("afterloop", TheFunction);
801
802 // Insert the conditional branch into the end of LoopEndBB.
803 Builder.CreateCondBr(EndCond, LoopBB, AfterBB);
804
805 // Any new code will be inserted in AfterBB.
806 Builder.SetInsertPoint(AfterBB);
807</pre>
808</div>
809
810<p>With the code for the body of the loop complete, we just need to finish up
811the control flow for it. This remembers the end block (for the phi node), then
812creates the block for the loop exit ("afterloop"). Based on the value of the
813exit condition, it creates a conditional branch that chooses between executing
814the loop again and exiting the loop. Any future code is emitted in the
815"afterloop" block, so it sets the insertion position to it.</p>
816
817<div class="doc_code">
818<pre>
819 // Add a new entry to the PHI node for the backedge.
820 Variable-&gt;addIncoming(NextVar, LoopEndBB);
821
822 // Restore the unshadowed variable.
823 if (OldVal)
824 NamedValues[VarName] = OldVal;
825 else
826 NamedValues.erase(VarName);
827
828 // for expr always returns 0.0.
829 return Constant::getNullValue(Type::DoubleTy);
830}
831</pre>
832</div>
833
834<p>The final code handles various cleanups: now that we have the "NextVar"
835value, we can add the incoming value to the loop PHI node. After that, we
836remove the loop variable from the symbol table, so that it isn't in scope after
837the for loop. Finally, code generation of the for loop always returns 0.0, so
838that is what we return from <tt>ForExprAST::Codegen</tt>.</p>
839
840<p>With this, we conclude the "adding control flow to Kaleidoscope" chapter of
841the tutorial. We added two control flow constructs, and used them to motivate
842a couple of aspects of the LLVM IR that are important for front-end implementors
843to know. In the next chapter of our saga, we will get a bit crazier and add
844operator overloading to our poor innocent language.</p>
845
846</div>
847
848<!-- *********************************************************************** -->
849<div class="doc_section"><a name="code">Full Code Listing</a></div>
850<!-- *********************************************************************** -->
851
852<div class="doc_text">
853
854<p>
855Here is the complete code listing for our running example, enhanced with the
856if/then/else and for expressions.. To build this example, use:
857</p>
858
859<div class="doc_code">
860<pre>
861 # Compile
862 g++ -g toy.cpp `llvm-config --cppflags --ldflags --libs core jit native` -O3 -o toy
863 # Run
864 ./toy
865</pre>
866</div>
867
868<p>Here is the code:</p>
869
870<div class="doc_code">
871<pre>
872#include "llvm/DerivedTypes.h"
873#include "llvm/ExecutionEngine/ExecutionEngine.h"
874#include "llvm/Module.h"
875#include "llvm/ModuleProvider.h"
876#include "llvm/PassManager.h"
877#include "llvm/Analysis/Verifier.h"
878#include "llvm/Target/TargetData.h"
879#include "llvm/Transforms/Scalar.h"
880#include "llvm/Support/LLVMBuilder.h"
881#include &lt;cstdio&gt;
882#include &lt;string&gt;
883#include &lt;map&gt;
884#include &lt;vector&gt;
885using namespace llvm;
886
887//===----------------------------------------------------------------------===//
888// Lexer
889//===----------------------------------------------------------------------===//
890
891// The lexer returns tokens [0-255] if it is an unknown character, otherwise one
892// of these for known things.
893enum Token {
894 tok_eof = -1,
895
896 // commands
897 tok_def = -2, tok_extern = -3,
898
899 // primary
900 tok_identifier = -4, tok_number = -5,
901
902 // control
903 tok_if = -6, tok_then = -7, tok_else = -8,
904 tok_for = -9, tok_in = -10
905};
906
907static std::string IdentifierStr; // Filled in if tok_identifier
908static double NumVal; // Filled in if tok_number
909
910/// gettok - Return the next token from standard input.
911static int gettok() {
912 static int LastChar = ' ';
913
914 // Skip any whitespace.
915 while (isspace(LastChar))
916 LastChar = getchar();
917
918 if (isalpha(LastChar)) { // identifier: [a-zA-Z][a-zA-Z0-9]*
919 IdentifierStr = LastChar;
920 while (isalnum((LastChar = getchar())))
921 IdentifierStr += LastChar;
922
923 if (IdentifierStr == "def") return tok_def;
924 if (IdentifierStr == "extern") return tok_extern;
925 if (IdentifierStr == "if") return tok_if;
926 if (IdentifierStr == "then") return tok_then;
927 if (IdentifierStr == "else") return tok_else;
928 if (IdentifierStr == "for") return tok_for;
929 if (IdentifierStr == "in") return tok_in;
930 return tok_identifier;
931 }
932
933 if (isdigit(LastChar) || LastChar == '.') { // Number: [0-9.]+
934 std::string NumStr;
935 do {
936 NumStr += LastChar;
937 LastChar = getchar();
938 } while (isdigit(LastChar) || LastChar == '.');
939
940 NumVal = strtod(NumStr.c_str(), 0);
941 return tok_number;
942 }
943
944 if (LastChar == '#') {
945 // Comment until end of line.
946 do LastChar = getchar();
947 while (LastChar != EOF &amp;&amp; LastChar != '\n' &amp; LastChar != '\r');
948
949 if (LastChar != EOF)
950 return gettok();
951 }
952
953 // Check for end of file. Don't eat the EOF.
954 if (LastChar == EOF)
955 return tok_eof;
956
957 // Otherwise, just return the character as its ascii value.
958 int ThisChar = LastChar;
959 LastChar = getchar();
960 return ThisChar;
961}
962
963//===----------------------------------------------------------------------===//
964// Abstract Syntax Tree (aka Parse Tree)
965//===----------------------------------------------------------------------===//
966
967/// ExprAST - Base class for all expression nodes.
968class ExprAST {
969public:
970 virtual ~ExprAST() {}
971 virtual Value *Codegen() = 0;
972};
973
974/// NumberExprAST - Expression class for numeric literals like "1.0".
975class NumberExprAST : public ExprAST {
976 double Val;
977public:
978 NumberExprAST(double val) : Val(val) {}
979 virtual Value *Codegen();
980};
981
982/// VariableExprAST - Expression class for referencing a variable, like "a".
983class VariableExprAST : public ExprAST {
984 std::string Name;
985public:
986 VariableExprAST(const std::string &amp;name) : Name(name) {}
987 virtual Value *Codegen();
988};
989
990/// BinaryExprAST - Expression class for a binary operator.
991class BinaryExprAST : public ExprAST {
992 char Op;
993 ExprAST *LHS, *RHS;
994public:
995 BinaryExprAST(char op, ExprAST *lhs, ExprAST *rhs)
996 : Op(op), LHS(lhs), RHS(rhs) {}
997 virtual Value *Codegen();
998};
999
1000/// CallExprAST - Expression class for function calls.
1001class CallExprAST : public ExprAST {
1002 std::string Callee;
1003 std::vector&lt;ExprAST*&gt; Args;
1004public:
1005 CallExprAST(const std::string &amp;callee, std::vector&lt;ExprAST*&gt; &amp;args)
1006 : Callee(callee), Args(args) {}
1007 virtual Value *Codegen();
1008};
1009
1010/// IfExprAST - Expression class for if/then/else.
1011class IfExprAST : public ExprAST {
1012 ExprAST *Cond, *Then, *Else;
1013public:
1014 IfExprAST(ExprAST *cond, ExprAST *then, ExprAST *_else)
1015 : Cond(cond), Then(then), Else(_else) {}
1016 virtual Value *Codegen();
1017};
1018
1019/// ForExprAST - Expression class for for/in.
1020class ForExprAST : public ExprAST {
1021 std::string VarName;
1022 ExprAST *Start, *End, *Step, *Body;
1023public:
1024 ForExprAST(const std::string &amp;varname, ExprAST *start, ExprAST *end,
1025 ExprAST *step, ExprAST *body)
1026 : VarName(varname), Start(start), End(end), Step(step), Body(body) {}
1027 virtual Value *Codegen();
1028};
1029
1030/// PrototypeAST - This class represents the "prototype" for a function,
1031/// which captures its argument names as well as if it is an operator.
1032class PrototypeAST {
1033 std::string Name;
1034 std::vector&lt;std::string&gt; Args;
1035public:
1036 PrototypeAST(const std::string &amp;name, const std::vector&lt;std::string&gt; &amp;args)
1037 : Name(name), Args(args) {}
1038
1039 Function *Codegen();
1040};
1041
1042/// FunctionAST - This class represents a function definition itself.
1043class FunctionAST {
1044 PrototypeAST *Proto;
1045 ExprAST *Body;
1046public:
1047 FunctionAST(PrototypeAST *proto, ExprAST *body)
1048 : Proto(proto), Body(body) {}
1049
1050 Function *Codegen();
1051};
1052
1053//===----------------------------------------------------------------------===//
1054// Parser
1055//===----------------------------------------------------------------------===//
1056
1057/// CurTok/getNextToken - Provide a simple token buffer. CurTok is the current
1058/// token the parser it looking at. getNextToken reads another token from the
1059/// lexer and updates CurTok with its results.
1060static int CurTok;
1061static int getNextToken() {
1062 return CurTok = gettok();
1063}
1064
1065/// BinopPrecedence - This holds the precedence for each binary operator that is
1066/// defined.
1067static std::map&lt;char, int&gt; BinopPrecedence;
1068
1069/// GetTokPrecedence - Get the precedence of the pending binary operator token.
1070static int GetTokPrecedence() {
1071 if (!isascii(CurTok))
1072 return -1;
1073
1074 // Make sure it's a declared binop.
1075 int TokPrec = BinopPrecedence[CurTok];
1076 if (TokPrec &lt;= 0) return -1;
1077 return TokPrec;
1078}
1079
1080/// Error* - These are little helper functions for error handling.
1081ExprAST *Error(const char *Str) { fprintf(stderr, "Error: %s\n", Str);return 0;}
1082PrototypeAST *ErrorP(const char *Str) { Error(Str); return 0; }
1083FunctionAST *ErrorF(const char *Str) { Error(Str); return 0; }
1084
1085static ExprAST *ParseExpression();
1086
1087/// identifierexpr
1088/// ::= identifer
1089/// ::= identifer '(' expression* ')'
1090static ExprAST *ParseIdentifierExpr() {
1091 std::string IdName = IdentifierStr;
1092
1093 getNextToken(); // eat identifer.
1094
1095 if (CurTok != '(') // Simple variable ref.
1096 return new VariableExprAST(IdName);
1097
1098 // Call.
1099 getNextToken(); // eat (
1100 std::vector&lt;ExprAST*&gt; Args;
1101 if (CurTok != ')') {
1102 while (1) {
1103 ExprAST *Arg = ParseExpression();
1104 if (!Arg) return 0;
1105 Args.push_back(Arg);
1106
1107 if (CurTok == ')') break;
1108
1109 if (CurTok != ',')
1110 return Error("Expected ')'");
1111 getNextToken();
1112 }
1113 }
1114
1115 // Eat the ')'.
1116 getNextToken();
1117
1118 return new CallExprAST(IdName, Args);
1119}
1120
1121/// numberexpr ::= number
1122static ExprAST *ParseNumberExpr() {
1123 ExprAST *Result = new NumberExprAST(NumVal);
1124 getNextToken(); // consume the number
1125 return Result;
1126}
1127
1128/// parenexpr ::= '(' expression ')'
1129static ExprAST *ParseParenExpr() {
1130 getNextToken(); // eat (.
1131 ExprAST *V = ParseExpression();
1132 if (!V) return 0;
1133
1134 if (CurTok != ')')
1135 return Error("expected ')'");
1136 getNextToken(); // eat ).
1137 return V;
1138}
1139
1140/// ifexpr ::= 'if' expression 'then' expression 'else' expression
1141static ExprAST *ParseIfExpr() {
1142 getNextToken(); // eat the if.
1143
1144 // condition.
1145 ExprAST *Cond = ParseExpression();
1146 if (!Cond) return 0;
1147
1148 if (CurTok != tok_then)
1149 return Error("expected then");
1150 getNextToken(); // eat the then
1151
1152 ExprAST *Then = ParseExpression();
1153 if (Then == 0) return 0;
1154
1155 if (CurTok != tok_else)
1156 return Error("expected else");
1157
1158 getNextToken();
1159
1160 ExprAST *Else = ParseExpression();
1161 if (!Else) return 0;
1162
1163 return new IfExprAST(Cond, Then, Else);
1164}
1165
1166/// forexpr ::= 'for' identifer '=' expr ',' expr (',' expr)? 'in' expression
1167static ExprAST *ParseForExpr() {
1168 getNextToken(); // eat the for.
1169
1170 if (CurTok != tok_identifier)
1171 return Error("expected identifier after for");
1172
1173 std::string IdName = IdentifierStr;
1174 getNextToken(); // eat identifer.
1175
1176 if (CurTok != '=')
1177 return Error("expected '=' after for");
1178 getNextToken(); // eat '='.
1179
1180
1181 ExprAST *Start = ParseExpression();
1182 if (Start == 0) return 0;
1183 if (CurTok != ',')
1184 return Error("expected ',' after for start value");
1185 getNextToken();
1186
1187 ExprAST *End = ParseExpression();
1188 if (End == 0) return 0;
1189
1190 // The step value is optional.
1191 ExprAST *Step = 0;
1192 if (CurTok == ',') {
1193 getNextToken();
1194 Step = ParseExpression();
1195 if (Step == 0) return 0;
1196 }
1197
1198 if (CurTok != tok_in)
1199 return Error("expected 'in' after for");
1200 getNextToken(); // eat 'in'.
1201
1202 ExprAST *Body = ParseExpression();
1203 if (Body == 0) return 0;
1204
1205 return new ForExprAST(IdName, Start, End, Step, Body);
1206}
1207
1208
1209/// primary
1210/// ::= identifierexpr
1211/// ::= numberexpr
1212/// ::= parenexpr
1213/// ::= ifexpr
1214/// ::= forexpr
1215static ExprAST *ParsePrimary() {
1216 switch (CurTok) {
1217 default: return Error("unknown token when expecting an expression");
1218 case tok_identifier: return ParseIdentifierExpr();
1219 case tok_number: return ParseNumberExpr();
1220 case '(': return ParseParenExpr();
1221 case tok_if: return ParseIfExpr();
1222 case tok_for: return ParseForExpr();
1223 }
1224}
1225
1226/// binoprhs
1227/// ::= ('+' primary)*
1228static ExprAST *ParseBinOpRHS(int ExprPrec, ExprAST *LHS) {
1229 // If this is a binop, find its precedence.
1230 while (1) {
1231 int TokPrec = GetTokPrecedence();
1232
1233 // If this is a binop that binds at least as tightly as the current binop,
1234 // consume it, otherwise we are done.
1235 if (TokPrec &lt; ExprPrec)
1236 return LHS;
1237
1238 // Okay, we know this is a binop.
1239 int BinOp = CurTok;
1240 getNextToken(); // eat binop
1241
1242 // Parse the primary expression after the binary operator.
1243 ExprAST *RHS = ParsePrimary();
1244 if (!RHS) return 0;
1245
1246 // If BinOp binds less tightly with RHS than the operator after RHS, let
1247 // the pending operator take RHS as its LHS.
1248 int NextPrec = GetTokPrecedence();
1249 if (TokPrec &lt; NextPrec) {
1250 RHS = ParseBinOpRHS(TokPrec+1, RHS);
1251 if (RHS == 0) return 0;
1252 }
1253
1254 // Merge LHS/RHS.
1255 LHS = new BinaryExprAST(BinOp, LHS, RHS);
1256 }
1257}
1258
1259/// expression
1260/// ::= primary binoprhs
1261///
1262static ExprAST *ParseExpression() {
1263 ExprAST *LHS = ParsePrimary();
1264 if (!LHS) return 0;
1265
1266 return ParseBinOpRHS(0, LHS);
1267}
1268
1269/// prototype
1270/// ::= id '(' id* ')'
1271static PrototypeAST *ParsePrototype() {
1272 if (CurTok != tok_identifier)
1273 return ErrorP("Expected function name in prototype");
1274
1275 std::string FnName = IdentifierStr;
1276 getNextToken();
1277
1278 if (CurTok != '(')
1279 return ErrorP("Expected '(' in prototype");
1280
1281 std::vector&lt;std::string&gt; ArgNames;
1282 while (getNextToken() == tok_identifier)
1283 ArgNames.push_back(IdentifierStr);
1284 if (CurTok != ')')
1285 return ErrorP("Expected ')' in prototype");
1286
1287 // success.
1288 getNextToken(); // eat ')'.
1289
1290 return new PrototypeAST(FnName, ArgNames);
1291}
1292
1293/// definition ::= 'def' prototype expression
1294static FunctionAST *ParseDefinition() {
1295 getNextToken(); // eat def.
1296 PrototypeAST *Proto = ParsePrototype();
1297 if (Proto == 0) return 0;
1298
1299 if (ExprAST *E = ParseExpression())
1300 return new FunctionAST(Proto, E);
1301 return 0;
1302}
1303
1304/// toplevelexpr ::= expression
1305static FunctionAST *ParseTopLevelExpr() {
1306 if (ExprAST *E = ParseExpression()) {
1307 // Make an anonymous proto.
1308 PrototypeAST *Proto = new PrototypeAST("", std::vector&lt;std::string&gt;());
1309 return new FunctionAST(Proto, E);
1310 }
1311 return 0;
1312}
1313
1314/// external ::= 'extern' prototype
1315static PrototypeAST *ParseExtern() {
1316 getNextToken(); // eat extern.
1317 return ParsePrototype();
1318}
1319
1320//===----------------------------------------------------------------------===//
1321// Code Generation
1322//===----------------------------------------------------------------------===//
1323
1324static Module *TheModule;
1325static LLVMFoldingBuilder Builder;
1326static std::map&lt;std::string, Value*&gt; NamedValues;
1327static FunctionPassManager *TheFPM;
1328
1329Value *ErrorV(const char *Str) { Error(Str); return 0; }
1330
1331Value *NumberExprAST::Codegen() {
1332 return ConstantFP::get(Type::DoubleTy, APFloat(Val));
1333}
1334
1335Value *VariableExprAST::Codegen() {
1336 // Look this variable up in the function.
1337 Value *V = NamedValues[Name];
1338 return V ? V : ErrorV("Unknown variable name");
1339}
1340
1341Value *BinaryExprAST::Codegen() {
1342 Value *L = LHS-&gt;Codegen();
1343 Value *R = RHS-&gt;Codegen();
1344 if (L == 0 || R == 0) return 0;
1345
1346 switch (Op) {
1347 case '+': return Builder.CreateAdd(L, R, "addtmp");
1348 case '-': return Builder.CreateSub(L, R, "subtmp");
1349 case '*': return Builder.CreateMul(L, R, "multmp");
1350 case '&lt;':
1351 L = Builder.CreateFCmpULT(L, R, "multmp");
1352 // Convert bool 0/1 to double 0.0 or 1.0
1353 return Builder.CreateUIToFP(L, Type::DoubleTy, "booltmp");
1354 default: return ErrorV("invalid binary operator");
1355 }
1356}
1357
1358Value *CallExprAST::Codegen() {
1359 // Look up the name in the global module table.
1360 Function *CalleeF = TheModule-&gt;getFunction(Callee);
1361 if (CalleeF == 0)
1362 return ErrorV("Unknown function referenced");
1363
1364 // If argument mismatch error.
1365 if (CalleeF-&gt;arg_size() != Args.size())
1366 return ErrorV("Incorrect # arguments passed");
1367
1368 std::vector&lt;Value*&gt; ArgsV;
1369 for (unsigned i = 0, e = Args.size(); i != e; ++i) {
1370 ArgsV.push_back(Args[i]-&gt;Codegen());
1371 if (ArgsV.back() == 0) return 0;
1372 }
1373
1374 return Builder.CreateCall(CalleeF, ArgsV.begin(), ArgsV.end(), "calltmp");
1375}
1376
1377Value *IfExprAST::Codegen() {
1378 Value *CondV = Cond-&gt;Codegen();
1379 if (CondV == 0) return 0;
1380
1381 // Convert condition to a bool by comparing equal to 0.0.
1382 CondV = Builder.CreateFCmpONE(CondV,
1383 ConstantFP::get(Type::DoubleTy, APFloat(0.0)),
1384 "ifcond");
1385
1386 Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();
1387
1388 // Create blocks for the then and else cases. Insert the 'then' block at the
1389 // end of the function.
1390 BasicBlock *ThenBB = new BasicBlock("then", TheFunction);
1391 BasicBlock *ElseBB = new BasicBlock("else");
1392 BasicBlock *MergeBB = new BasicBlock("ifcont");
1393
1394 Builder.CreateCondBr(CondV, ThenBB, ElseBB);
1395
1396 // Emit then value.
1397 Builder.SetInsertPoint(ThenBB);
1398
1399 Value *ThenV = Then-&gt;Codegen();
1400 if (ThenV == 0) return 0;
1401
1402 Builder.CreateBr(MergeBB);
1403 // Codegen of 'Then' can change the current block, update ThenBB for the PHI.
1404 ThenBB = Builder.GetInsertBlock();
1405
1406 // Emit else block.
1407 TheFunction-&gt;getBasicBlockList().push_back(ElseBB);
1408 Builder.SetInsertPoint(ElseBB);
1409
1410 Value *ElseV = Else-&gt;Codegen();
1411 if (ElseV == 0) return 0;
1412
1413 Builder.CreateBr(MergeBB);
1414 // Codegen of 'Else' can change the current block, update ElseBB for the PHI.
1415 ElseBB = Builder.GetInsertBlock();
1416
1417 // Emit merge block.
1418 TheFunction-&gt;getBasicBlockList().push_back(MergeBB);
1419 Builder.SetInsertPoint(MergeBB);
1420 PHINode *PN = Builder.CreatePHI(Type::DoubleTy, "iftmp");
1421
1422 PN-&gt;addIncoming(ThenV, ThenBB);
1423 PN-&gt;addIncoming(ElseV, ElseBB);
1424 return PN;
1425}
1426
1427Value *ForExprAST::Codegen() {
Chris Lattnerf5234802007-10-31 06:47:39 +00001428 // Output this as:
1429 // ...
1430 // start = startexpr
1431 // goto loop
1432 // loop:
1433 // variable = phi [start, loopheader], [nextvariable, loopend]
1434 // ...
1435 // bodyexpr
1436 // ...
1437 // loopend:
1438 // step = stepexpr
1439 // nextvariable = variable + step
1440 // endcond = endexpr
1441 // br endcond, loop, endloop
1442 // outloop:
1443
1444 // Emit the start code first, without 'variable' in scope.
1445 Value *StartVal = Start-&gt;Codegen();
1446 if (StartVal == 0) return 0;
1447
1448 // Make the new basic block for the loop header, inserting after current
1449 // block.
1450 Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();
1451 BasicBlock *PreheaderBB = Builder.GetInsertBlock();
1452 BasicBlock *LoopBB = new BasicBlock("loop", TheFunction);
1453
1454 // Insert an explicit fall through from the current block to the LoopBB.
Chris Lattnerf5234802007-10-31 06:47:39 +00001455 Builder.CreateBr(LoopBB);
Chris Lattner6093bd52007-10-31 07:29:43 +00001456
1457 // Start insertion in LoopBB.
Chris Lattnerf5234802007-10-31 06:47:39 +00001458 Builder.SetInsertPoint(LoopBB);
1459
1460 // Start the PHI node with an entry for Start.
1461 PHINode *Variable = Builder.CreatePHI(Type::DoubleTy, VarName.c_str());
1462 Variable-&gt;addIncoming(StartVal, PreheaderBB);
1463
1464 // Within the loop, the variable is defined equal to the PHI node. If it
1465 // shadows an existing variable, we have to restore it, so save it now.
1466 Value *OldVal = NamedValues[VarName];
1467 NamedValues[VarName] = Variable;
1468
1469 // Emit the body of the loop. This, like any other expr, can change the
1470 // current BB. Note that we ignore the value computed by the body, but don't
1471 // allow an error.
1472 if (Body-&gt;Codegen() == 0)
1473 return 0;
1474
1475 // Emit the step value.
1476 Value *StepVal;
1477 if (Step) {
1478 StepVal = Step-&gt;Codegen();
1479 if (StepVal == 0) return 0;
1480 } else {
1481 // If not specified, use 1.0.
1482 StepVal = ConstantFP::get(Type::DoubleTy, APFloat(1.0));
1483 }
1484
1485 Value *NextVar = Builder.CreateAdd(Variable, StepVal, "nextvar");
1486
Chris Lattnerf5234802007-10-31 06:47:39 +00001487 // Compute the end condition.
1488 Value *EndCond = End-&gt;Codegen();
1489 if (EndCond == 0) return EndCond;
1490
1491 // Convert condition to a bool by comparing equal to 0.0.
1492 EndCond = Builder.CreateFCmpONE(EndCond,
1493 ConstantFP::get(Type::DoubleTy, APFloat(0.0)),
1494 "loopcond");
1495
1496 // Create the "after loop" block and insert it.
1497 BasicBlock *LoopEndBB = Builder.GetInsertBlock();
1498 BasicBlock *AfterBB = new BasicBlock("afterloop", TheFunction);
1499
1500 // Insert the conditional branch into the end of LoopEndBB.
1501 Builder.CreateCondBr(EndCond, LoopBB, AfterBB);
1502
1503 // Any new code will be inserted in AfterBB.
1504 Builder.SetInsertPoint(AfterBB);
1505
1506 // Add a new entry to the PHI node for the backedge.
1507 Variable-&gt;addIncoming(NextVar, LoopEndBB);
1508
1509 // Restore the unshadowed variable.
1510 if (OldVal)
1511 NamedValues[VarName] = OldVal;
1512 else
1513 NamedValues.erase(VarName);
1514
1515
1516 // for expr always returns 0.0.
1517 return Constant::getNullValue(Type::DoubleTy);
1518}
Chris Lattner6093bd52007-10-31 07:29:43 +00001519
1520Function *PrototypeAST::Codegen() {
1521 // Make the function type: double(double,double) etc.
1522 std::vector&lt;const Type*&gt; Doubles(Args.size(), Type::DoubleTy);
1523 FunctionType *FT = FunctionType::get(Type::DoubleTy, Doubles, false);
1524
1525 Function *F = new Function(FT, Function::ExternalLinkage, Name, TheModule);
1526
1527 // If F conflicted, there was already something named 'Name'. If it has a
1528 // body, don't allow redefinition or reextern.
1529 if (F-&gt;getName() != Name) {
1530 // Delete the one we just made and get the existing one.
1531 F-&gt;eraseFromParent();
1532 F = TheModule-&gt;getFunction(Name);
1533
1534 // If F already has a body, reject this.
1535 if (!F-&gt;empty()) {
1536 ErrorF("redefinition of function");
1537 return 0;
1538 }
1539
1540 // If F took a different number of args, reject.
1541 if (F-&gt;arg_size() != Args.size()) {
1542 ErrorF("redefinition of function with different # args");
1543 return 0;
1544 }
1545 }
1546
1547 // Set names for all arguments.
1548 unsigned Idx = 0;
1549 for (Function::arg_iterator AI = F-&gt;arg_begin(); Idx != Args.size();
1550 ++AI, ++Idx) {
1551 AI-&gt;setName(Args[Idx]);
1552
1553 // Add arguments to variable symbol table.
1554 NamedValues[Args[Idx]] = AI;
1555 }
1556
1557 return F;
1558}
1559
1560Function *FunctionAST::Codegen() {
1561 NamedValues.clear();
1562
1563 Function *TheFunction = Proto-&gt;Codegen();
1564 if (TheFunction == 0)
1565 return 0;
1566
1567 // Create a new basic block to start insertion into.
1568 BasicBlock *BB = new BasicBlock("entry", TheFunction);
1569 Builder.SetInsertPoint(BB);
1570
1571 if (Value *RetVal = Body-&gt;Codegen()) {
1572 // Finish off the function.
1573 Builder.CreateRet(RetVal);
1574
1575 // Validate the generated code, checking for consistency.
1576 verifyFunction(*TheFunction);
1577
1578 // Optimize the function.
1579 TheFPM-&gt;run(*TheFunction);
1580
1581 return TheFunction;
1582 }
1583
1584 // Error reading body, remove function.
1585 TheFunction-&gt;eraseFromParent();
1586 return 0;
1587}
1588
1589//===----------------------------------------------------------------------===//
1590// Top-Level parsing and JIT Driver
1591//===----------------------------------------------------------------------===//
1592
1593static ExecutionEngine *TheExecutionEngine;
1594
1595static void HandleDefinition() {
1596 if (FunctionAST *F = ParseDefinition()) {
1597 if (Function *LF = F-&gt;Codegen()) {
1598 fprintf(stderr, "Read function definition:");
1599 LF-&gt;dump();
1600 }
1601 } else {
1602 // Skip token for error recovery.
1603 getNextToken();
1604 }
1605}
1606
1607static void HandleExtern() {
1608 if (PrototypeAST *P = ParseExtern()) {
1609 if (Function *F = P-&gt;Codegen()) {
1610 fprintf(stderr, "Read extern: ");
1611 F-&gt;dump();
1612 }
1613 } else {
1614 // Skip token for error recovery.
1615 getNextToken();
1616 }
1617}
1618
1619static void HandleTopLevelExpression() {
1620 // Evaluate a top level expression into an anonymous function.
1621 if (FunctionAST *F = ParseTopLevelExpr()) {
1622 if (Function *LF = F-&gt;Codegen()) {
1623 // JIT the function, returning a function pointer.
1624 void *FPtr = TheExecutionEngine-&gt;getPointerToFunction(LF);
1625
1626 // Cast it to the right type (takes no arguments, returns a double) so we
1627 // can call it as a native function.
1628 double (*FP)() = (double (*)())FPtr;
1629 fprintf(stderr, "Evaluated to %f\n", FP());
1630 }
1631 } else {
1632 // Skip token for error recovery.
1633 getNextToken();
1634 }
1635}
1636
1637/// top ::= definition | external | expression | ';'
1638static void MainLoop() {
1639 while (1) {
1640 fprintf(stderr, "ready&gt; ");
1641 switch (CurTok) {
1642 case tok_eof: return;
1643 case ';': getNextToken(); break; // ignore top level semicolons.
1644 case tok_def: HandleDefinition(); break;
1645 case tok_extern: HandleExtern(); break;
1646 default: HandleTopLevelExpression(); break;
1647 }
1648 }
1649}
Chris Lattnerf5234802007-10-31 06:47:39 +00001650
Chris Lattner602c832c2007-10-31 06:30:21 +00001651
Chris Lattner602c832c2007-10-31 06:30:21 +00001652
Chris Lattner6093bd52007-10-31 07:29:43 +00001653//===----------------------------------------------------------------------===//
1654// "Library" functions that can be "extern'd" from user code.
1655//===----------------------------------------------------------------------===//
Chris Lattner602c832c2007-10-31 06:30:21 +00001656
Chris Lattner6093bd52007-10-31 07:29:43 +00001657/// putchard - putchar that takes a double and returns 0.
1658extern "C"
1659double putchard(double X) {
1660 putchar((char)X);
1661 return 0;
1662}
Chris Lattner602c832c2007-10-31 06:30:21 +00001663
Chris Lattner6093bd52007-10-31 07:29:43 +00001664//===----------------------------------------------------------------------===//
1665// Main driver code.
1666//===----------------------------------------------------------------------===//
Chris Lattner602c832c2007-10-31 06:30:21 +00001667
Chris Lattner6093bd52007-10-31 07:29:43 +00001668int main() {
1669 // Install standard binary operators.
1670 // 1 is lowest precedence.
1671 BinopPrecedence['&lt;'] = 10;
1672 BinopPrecedence['+'] = 20;
1673 BinopPrecedence['-'] = 20;
1674 BinopPrecedence['*'] = 40; // highest.
Chris Lattner602c832c2007-10-31 06:30:21 +00001675
Chris Lattner6093bd52007-10-31 07:29:43 +00001676 // Prime the first token.
1677 fprintf(stderr, "ready&gt; ");
1678 getNextToken();
Chris Lattner602c832c2007-10-31 06:30:21 +00001679
Chris Lattner6093bd52007-10-31 07:29:43 +00001680 // Make the module, which holds all the code.
1681 TheModule = new Module("my cool jit");
1682
1683 // Create the JIT.
1684 TheExecutionEngine = ExecutionEngine::create(TheModule);
1685
1686 {
1687 ExistingModuleProvider OurModuleProvider(TheModule);
1688 FunctionPassManager OurFPM(&amp;OurModuleProvider);
1689
1690 // Set up the optimizer pipeline. Start with registering info about how the
1691 // target lays out data structures.
1692 OurFPM.add(new TargetData(*TheExecutionEngine-&gt;getTargetData()));
1693 // Do simple "peephole" optimizations and bit-twiddling optzns.
1694 OurFPM.add(createInstructionCombiningPass());
1695 // Reassociate expressions.
1696 OurFPM.add(createReassociatePass());
1697 // Eliminate Common SubExpressions.
1698 OurFPM.add(createGVNPass());
1699 // Simplify the control flow graph (deleting unreachable blocks, etc).
1700 OurFPM.add(createCFGSimplificationPass());
1701 // Set the global so the code gen can use this.
1702 TheFPM = &amp;OurFPM;
1703
1704 // Run the main "interpreter loop" now.
1705 MainLoop();
1706
1707 TheFPM = 0;
1708 } // Free module provider and pass manager.
1709
1710
1711 // Print out all of the generated code.
1712 TheModule-&gt;dump();
1713 return 0;
1714}
Chris Lattner602c832c2007-10-31 06:30:21 +00001715</pre>
1716</div>
1717
1718</div>
1719
1720<!-- *********************************************************************** -->
1721<hr>
1722<address>
1723 <a href="http://jigsaw.w3.org/css-validator/check/referer"><img
1724 src="http://jigsaw.w3.org/css-validator/images/vcss" alt="Valid CSS!"></a>
1725 <a href="http://validator.w3.org/check/referer"><img
1726 src="http://www.w3.org/Icons/valid-html401" alt="Valid HTML 4.01!"></a>
1727
1728 <a href="mailto:sabre@nondot.org">Chris Lattner</a><br>
1729 <a href="http://llvm.org">The LLVM Compiler Infrastructure</a><br>
1730 Last modified: $Date: 2007-10-17 11:05:13 -0700 (Wed, 17 Oct 2007) $
1731</address>
1732</body>
1733</html>