blob: 4fbf7beca42bc9f023b32ddc8e964a04270bc880 [file] [log] [blame]
Chris Lattner602c832c2007-10-31 06:30:21 +00001<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
2 "http://www.w3.org/TR/html4/strict.dtd">
3
4<html>
5<head>
6 <title>Kaleidoscope: Extending the Language: Control Flow</title>
7 <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
8 <meta name="author" content="Chris Lattner">
9 <link rel="stylesheet" href="../llvm.css" type="text/css">
10</head>
11
12<body>
13
NAKAMURA Takumi05d02652011-04-18 23:59:50 +000014<h1>Kaleidoscope: Extending the Language: Control Flow</h1>
Chris Lattner602c832c2007-10-31 06:30:21 +000015
Chris Lattner128eb862007-11-05 19:06:59 +000016<ul>
Chris Lattner0e555b12007-11-05 20:04:56 +000017<li><a href="index.html">Up to Tutorial Index</a></li>
Chris Lattner128eb862007-11-05 19:06:59 +000018<li>Chapter 5
19 <ol>
20 <li><a href="#intro">Chapter 5 Introduction</a></li>
21 <li><a href="#ifthen">If/Then/Else</a>
22 <ol>
23 <li><a href="#iflexer">Lexer Extensions</a></li>
24 <li><a href="#ifast">AST Extensions</a></li>
25 <li><a href="#ifparser">Parser Extensions</a></li>
26 <li><a href="#ifir">LLVM IR</a></li>
27 <li><a href="#ifcodegen">Code Generation</a></li>
28 </ol>
29 </li>
30 <li><a href="#for">'for' Loop Expression</a>
31 <ol>
32 <li><a href="#forlexer">Lexer Extensions</a></li>
33 <li><a href="#forast">AST Extensions</a></li>
34 <li><a href="#forparser">Parser Extensions</a></li>
35 <li><a href="#forir">LLVM IR</a></li>
36 <li><a href="#forcodegen">Code Generation</a></li>
37 </ol>
38 </li>
39 <li><a href="#code">Full Code Listing</a></li>
40 </ol>
41</li>
Chris Lattner0e555b12007-11-05 20:04:56 +000042<li><a href="LangImpl6.html">Chapter 6</a>: Extending the Language:
43User-defined Operators</li>
Chris Lattner128eb862007-11-05 19:06:59 +000044</ul>
45
Chris Lattner602c832c2007-10-31 06:30:21 +000046<div class="doc_author">
47 <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a></p>
48</div>
49
50<!-- *********************************************************************** -->
NAKAMURA Takumi05d02652011-04-18 23:59:50 +000051<h2><a name="intro">Chapter 5 Introduction</a></h2>
Chris Lattner602c832c2007-10-31 06:30:21 +000052<!-- *********************************************************************** -->
53
54<div class="doc_text">
55
Chris Lattner128eb862007-11-05 19:06:59 +000056<p>Welcome to Chapter 5 of the "<a href="index.html">Implementing a language
57with LLVM</a>" tutorial. Parts 1-4 described the implementation of the simple
Chris Lattner41fcea32007-11-13 07:06:30 +000058Kaleidoscope language and included support for generating LLVM IR, followed by
Chris Lattner602c832c2007-10-31 06:30:21 +000059optimizations and a JIT compiler. Unfortunately, as presented, Kaleidoscope is
60mostly useless: it has no control flow other than call and return. This means
61that you can't have conditional branches in the code, significantly limiting its
62power. In this episode of "build that compiler", we'll extend Kaleidoscope to
Chris Lattner1092a962007-11-07 05:47:48 +000063have an if/then/else expression plus a simple 'for' loop.</p>
Chris Lattner602c832c2007-10-31 06:30:21 +000064
65</div>
66
67<!-- *********************************************************************** -->
NAKAMURA Takumi05d02652011-04-18 23:59:50 +000068<h2><a name="ifthen">If/Then/Else</a></h2>
Chris Lattner602c832c2007-10-31 06:30:21 +000069<!-- *********************************************************************** -->
70
71<div class="doc_text">
72
73<p>
Chris Lattner41fcea32007-11-13 07:06:30 +000074Extending Kaleidoscope to support if/then/else is quite straightforward. It
Duncan Sands60c8bf52011-02-27 13:54:01 +000075basically requires adding support for this "new" concept to the lexer,
Chris Lattner602c832c2007-10-31 06:30:21 +000076parser, AST, and LLVM code emitter. This example is nice, because it shows how
77easy it is to "grow" a language over time, incrementally extending it as new
78ideas are discovered.</p>
79
Chris Lattner41fcea32007-11-13 07:06:30 +000080<p>Before we get going on "how" we add this extension, lets talk about "what" we
Chris Lattner602c832c2007-10-31 06:30:21 +000081want. The basic idea is that we want to be able to write this sort of thing:
82</p>
83
84<div class="doc_code">
85<pre>
86def fib(x)
87 if x &lt; 3 then
88 1
89 else
90 fib(x-1)+fib(x-2);
91</pre>
92</div>
93
94<p>In Kaleidoscope, every construct is an expression: there are no statements.
95As such, the if/then/else expression needs to return a value like any other.
96Since we're using a mostly functional form, we'll have it evaluate its
97conditional, then return the 'then' or 'else' value based on how the condition
98was resolved. This is very similar to the C "?:" expression.</p>
99
Chris Lattner41fcea32007-11-13 07:06:30 +0000100<p>The semantics of the if/then/else expression is that it evaluates the
Chris Lattner1092a962007-11-07 05:47:48 +0000101condition to a boolean equality value: 0.0 is considered to be false and
102everything else is considered to be true.
Chris Lattner602c832c2007-10-31 06:30:21 +0000103If the condition is true, the first subexpression is evaluated and returned, if
104the condition is false, the second subexpression is evaluated and returned.
105Since Kaleidoscope allows side-effects, this behavior is important to nail down.
106</p>
107
Chris Lattner41fcea32007-11-13 07:06:30 +0000108<p>Now that we know what we "want", lets break this down into its constituent
Chris Lattner602c832c2007-10-31 06:30:21 +0000109pieces.</p>
110
111</div>
112
113<!-- ======================================================================= -->
NAKAMURA Takumi05d02652011-04-18 23:59:50 +0000114<h4><a name="iflexer">Lexer Extensions for If/Then/Else</a></h4>
Chris Lattner602c832c2007-10-31 06:30:21 +0000115<!-- ======================================================================= -->
116
117
118<div class="doc_text">
119
Chris Lattner41fcea32007-11-13 07:06:30 +0000120<p>The lexer extensions are straightforward. First we add new enum values
Chris Lattner602c832c2007-10-31 06:30:21 +0000121for the relevant tokens:</p>
122
123<div class="doc_code">
124<pre>
125 // control
126 tok_if = -6, tok_then = -7, tok_else = -8,
127</pre>
128</div>
129
Chris Lattner41fcea32007-11-13 07:06:30 +0000130<p>Once we have that, we recognize the new keywords in the lexer. This is pretty simple
Chris Lattner602c832c2007-10-31 06:30:21 +0000131stuff:</p>
132
133<div class="doc_code">
134<pre>
135 ...
136 if (IdentifierStr == "def") return tok_def;
137 if (IdentifierStr == "extern") return tok_extern;
138 <b>if (IdentifierStr == "if") return tok_if;
139 if (IdentifierStr == "then") return tok_then;
140 if (IdentifierStr == "else") return tok_else;</b>
141 return tok_identifier;
142</pre>
143</div>
144
145</div>
146
147<!-- ======================================================================= -->
NAKAMURA Takumi05d02652011-04-18 23:59:50 +0000148<h4><a name="ifast">AST Extensions for If/Then/Else</a></h4>
Chris Lattner602c832c2007-10-31 06:30:21 +0000149<!-- ======================================================================= -->
150
151<div class="doc_text">
152
153<p>To represent the new expression we add a new AST node for it:</p>
154
155<div class="doc_code">
156<pre>
157/// IfExprAST - Expression class for if/then/else.
158class IfExprAST : public ExprAST {
159 ExprAST *Cond, *Then, *Else;
160public:
161 IfExprAST(ExprAST *cond, ExprAST *then, ExprAST *_else)
162 : Cond(cond), Then(then), Else(_else) {}
163 virtual Value *Codegen();
164};
165</pre>
166</div>
167
168<p>The AST node just has pointers to the various subexpressions.</p>
169
170</div>
171
172<!-- ======================================================================= -->
NAKAMURA Takumi05d02652011-04-18 23:59:50 +0000173<h4><a name="ifparser">Parser Extensions for If/Then/Else</a></h4>
Chris Lattner602c832c2007-10-31 06:30:21 +0000174<!-- ======================================================================= -->
175
176<div class="doc_text">
177
178<p>Now that we have the relevant tokens coming from the lexer and we have the
Chris Lattner41fcea32007-11-13 07:06:30 +0000179AST node to build, our parsing logic is relatively straightforward. First we
Chris Lattner602c832c2007-10-31 06:30:21 +0000180define a new parsing function:</p>
181
182<div class="doc_code">
183<pre>
184/// ifexpr ::= 'if' expression 'then' expression 'else' expression
185static ExprAST *ParseIfExpr() {
186 getNextToken(); // eat the if.
187
188 // condition.
189 ExprAST *Cond = ParseExpression();
190 if (!Cond) return 0;
191
192 if (CurTok != tok_then)
193 return Error("expected then");
194 getNextToken(); // eat the then
195
196 ExprAST *Then = ParseExpression();
197 if (Then == 0) return 0;
198
199 if (CurTok != tok_else)
200 return Error("expected else");
201
202 getNextToken();
203
204 ExprAST *Else = ParseExpression();
205 if (!Else) return 0;
206
207 return new IfExprAST(Cond, Then, Else);
208}
209</pre>
210</div>
211
212<p>Next we hook it up as a primary expression:</p>
213
214<div class="doc_code">
215<pre>
216static ExprAST *ParsePrimary() {
217 switch (CurTok) {
218 default: return Error("unknown token when expecting an expression");
219 case tok_identifier: return ParseIdentifierExpr();
220 case tok_number: return ParseNumberExpr();
221 case '(': return ParseParenExpr();
222 <b>case tok_if: return ParseIfExpr();</b>
223 }
224}
225</pre>
226</div>
227
228</div>
229
230<!-- ======================================================================= -->
NAKAMURA Takumi05d02652011-04-18 23:59:50 +0000231<h4><a name="ifir">LLVM IR for If/Then/Else</a></h4>
Chris Lattner602c832c2007-10-31 06:30:21 +0000232<!-- ======================================================================= -->
233
234<div class="doc_text">
235
236<p>Now that we have it parsing and building the AST, the final piece is adding
237LLVM code generation support. This is the most interesting part of the
238if/then/else example, because this is where it starts to introduce new concepts.
Chris Lattner1092a962007-11-07 05:47:48 +0000239All of the code above has been thoroughly described in previous chapters.
Chris Lattner602c832c2007-10-31 06:30:21 +0000240</p>
241
242<p>To motivate the code we want to produce, lets take a look at a simple
243example. Consider:</p>
244
245<div class="doc_code">
246<pre>
247extern foo();
248extern bar();
249def baz(x) if x then foo() else bar();
250</pre>
251</div>
252
253<p>If you disable optimizations, the code you'll (soon) get from Kaleidoscope
254looks like this:</p>
255
256<div class="doc_code">
257<pre>
258declare double @foo()
259
260declare double @bar()
261
262define double @baz(double %x) {
263entry:
264 %ifcond = fcmp one double %x, 0.000000e+00
265 br i1 %ifcond, label %then, label %else
266
267then: ; preds = %entry
268 %calltmp = call double @foo()
269 br label %ifcont
270
271else: ; preds = %entry
272 %calltmp1 = call double @bar()
273 br label %ifcont
274
275ifcont: ; preds = %else, %then
276 %iftmp = phi double [ %calltmp, %then ], [ %calltmp1, %else ]
277 ret double %iftmp
278}
279</pre>
280</div>
281
282<p>To visualize the control flow graph, you can use a nifty feature of the LLVM
283'<a href="http://llvm.org/cmds/opt.html">opt</a>' tool. If you put this LLVM IR
284into "t.ll" and run "<tt>llvm-as &lt; t.ll | opt -analyze -view-cfg</tt>", <a
285href="../ProgrammersManual.html#ViewGraph">a window will pop up</a> and you'll
286see this graph:</p>
287
Benjamin Kramere15192b2009-08-05 15:42:44 +0000288<div style="text-align: center"><img src="LangImpl5-cfg.png" alt="Example CFG" width="423"
289height="315"></div>
Chris Lattner602c832c2007-10-31 06:30:21 +0000290
291<p>Another way to get this is to call "<tt>F-&gt;viewCFG()</tt>" or
292"<tt>F-&gt;viewCFGOnly()</tt>" (where F is a "<tt>Function*</tt>") either by
293inserting actual calls into the code and recompiling or by calling these in the
294debugger. LLVM has many nice features for visualizing various graphs.</p>
295
Chris Lattner41fcea32007-11-13 07:06:30 +0000296<p>Getting back to the generated code, it is fairly simple: the entry block
Chris Lattner602c832c2007-10-31 06:30:21 +0000297evaluates the conditional expression ("x" in our case here) and compares the
298result to 0.0 with the "<tt><a href="../LangRef.html#i_fcmp">fcmp</a> one</tt>"
Chris Lattner1092a962007-11-07 05:47:48 +0000299instruction ('one' is "Ordered and Not Equal"). Based on the result of this
Chris Lattner602c832c2007-10-31 06:30:21 +0000300expression, the code jumps to either the "then" or "else" blocks, which contain
Chris Lattner1092a962007-11-07 05:47:48 +0000301the expressions for the true/false cases.</p>
Chris Lattner602c832c2007-10-31 06:30:21 +0000302
Chris Lattner41fcea32007-11-13 07:06:30 +0000303<p>Once the then/else blocks are finished executing, they both branch back to the
Chris Lattner1092a962007-11-07 05:47:48 +0000304'ifcont' block to execute the code that happens after the if/then/else. In this
Chris Lattner602c832c2007-10-31 06:30:21 +0000305case the only thing left to do is to return to the caller of the function. The
306question then becomes: how does the code know which expression to return?</p>
307
308<p>The answer to this question involves an important SSA operation: the
309<a href="http://en.wikipedia.org/wiki/Static_single_assignment_form">Phi
310operation</a>. If you're not familiar with SSA, <a
311href="http://en.wikipedia.org/wiki/Static_single_assignment_form">the wikipedia
312article</a> is a good introduction and there are various other introductions to
313it available on your favorite search engine. The short version is that
314"execution" of the Phi operation requires "remembering" which block control came
315from. The Phi operation takes on the value corresponding to the input control
316block. In this case, if control comes in from the "then" block, it gets the
317value of "calltmp". If control comes from the "else" block, it gets the value
318of "calltmp1".</p>
319
Chris Lattner41fcea32007-11-13 07:06:30 +0000320<p>At this point, you are probably starting to think "Oh no! This means my
Chris Lattner602c832c2007-10-31 06:30:21 +0000321simple and elegant front-end will have to start generating SSA form in order to
322use LLVM!". Fortunately, this is not the case, and we strongly advise
323<em>not</em> implementing an SSA construction algorithm in your front-end
324unless there is an amazingly good reason to do so. In practice, there are two
Chris Lattner41fcea32007-11-13 07:06:30 +0000325sorts of values that float around in code written for your average imperative
Chris Lattner602c832c2007-10-31 06:30:21 +0000326programming language that might need Phi nodes:</p>
327
328<ol>
329<li>Code that involves user variables: <tt>x = 1; x = x + 1; </tt></li>
Chris Lattner41fcea32007-11-13 07:06:30 +0000330<li>Values that are implicit in the structure of your AST, such as the Phi node
Chris Lattner602c832c2007-10-31 06:30:21 +0000331in this case.</li>
332</ol>
333
Chris Lattnerb0f0deb2007-11-05 07:02:49 +0000334<p>In <a href="LangImpl7.html">Chapter 7</a> of this tutorial ("mutable
335variables"), we'll talk about #1
Chris Lattner602c832c2007-10-31 06:30:21 +0000336in depth. For now, just believe me that you don't need SSA construction to
Chris Lattner41fcea32007-11-13 07:06:30 +0000337handle this case. For #2, you have the choice of using the techniques that we will
338describe for #1, or you can insert Phi nodes directly, if convenient. In this
Chris Lattner602c832c2007-10-31 06:30:21 +0000339case, it is really really easy to generate the Phi node, so we choose to do it
340directly.</p>
341
342<p>Okay, enough of the motivation and overview, lets generate code!</p>
343
344</div>
345
346<!-- ======================================================================= -->
NAKAMURA Takumi05d02652011-04-18 23:59:50 +0000347<h4><a name="ifcodegen">Code Generation for If/Then/Else</a></h4>
Chris Lattner602c832c2007-10-31 06:30:21 +0000348<!-- ======================================================================= -->
349
350<div class="doc_text">
351
352<p>In order to generate code for this, we implement the <tt>Codegen</tt> method
353for <tt>IfExprAST</tt>:</p>
354
355<div class="doc_code">
356<pre>
357Value *IfExprAST::Codegen() {
358 Value *CondV = Cond-&gt;Codegen();
359 if (CondV == 0) return 0;
360
361 // Convert condition to a bool by comparing equal to 0.0.
362 CondV = Builder.CreateFCmpONE(CondV,
Owen Anderson6f83c9c2009-07-27 20:59:43 +0000363 ConstantFP::get(getGlobalContext(), APFloat(0.0)),
Chris Lattner602c832c2007-10-31 06:30:21 +0000364 "ifcond");
365</pre>
366</div>
367
Chris Lattner41fcea32007-11-13 07:06:30 +0000368<p>This code is straightforward and similar to what we saw before. We emit the
Chris Lattner602c832c2007-10-31 06:30:21 +0000369expression for the condition, then compare that value to zero to get a truth
370value as a 1-bit (bool) value.</p>
371
372<div class="doc_code">
373<pre>
374 Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();
375
376 // Create blocks for the then and else cases. Insert the 'then' block at the
377 // end of the function.
Owen Anderson1d0be152009-08-13 21:58:54 +0000378 BasicBlock *ThenBB = BasicBlock::Create(getGlobalContext(), "then", TheFunction);
379 BasicBlock *ElseBB = BasicBlock::Create(getGlobalContext(), "else");
380 BasicBlock *MergeBB = BasicBlock::Create(getGlobalContext(), "ifcont");
Chris Lattner602c832c2007-10-31 06:30:21 +0000381
382 Builder.CreateCondBr(CondV, ThenBB, ElseBB);
383</pre>
384</div>
385
386<p>This code creates the basic blocks that are related to the if/then/else
387statement, and correspond directly to the blocks in the example above. The
Chris Lattner1092a962007-11-07 05:47:48 +0000388first line gets the current Function object that is being built. It
Chris Lattner602c832c2007-10-31 06:30:21 +0000389gets this by asking the builder for the current BasicBlock, and asking that
390block for its "parent" (the function it is currently embedded into).</p>
391
392<p>Once it has that, it creates three blocks. Note that it passes "TheFunction"
393into the constructor for the "then" block. This causes the constructor to
Chris Lattner41fcea32007-11-13 07:06:30 +0000394automatically insert the new block into the end of the specified function. The
Chris Lattner602c832c2007-10-31 06:30:21 +0000395other two blocks are created, but aren't yet inserted into the function.</p>
396
397<p>Once the blocks are created, we can emit the conditional branch that chooses
398between them. Note that creating new blocks does not implicitly affect the
Duncan Sands89f6d882008-04-13 06:22:09 +0000399IRBuilder, so it is still inserting into the block that the condition
Chris Lattner602c832c2007-10-31 06:30:21 +0000400went into. Also note that it is creating a branch to the "then" block and the
401"else" block, even though the "else" block isn't inserted into the function yet.
402This is all ok: it is the standard way that LLVM supports forward
403references.</p>
404
405<div class="doc_code">
406<pre>
407 // Emit then value.
408 Builder.SetInsertPoint(ThenBB);
409
410 Value *ThenV = Then-&gt;Codegen();
411 if (ThenV == 0) return 0;
412
413 Builder.CreateBr(MergeBB);
414 // Codegen of 'Then' can change the current block, update ThenBB for the PHI.
415 ThenBB = Builder.GetInsertBlock();
416</pre>
417</div>
418
419<p>After the conditional branch is inserted, we move the builder to start
420inserting into the "then" block. Strictly speaking, this call moves the
421insertion point to be at the end of the specified block. However, since the
422"then" block is empty, it also starts out by inserting at the beginning of the
423block. :)</p>
424
425<p>Once the insertion point is set, we recursively codegen the "then" expression
Chris Lattner41fcea32007-11-13 07:06:30 +0000426from the AST. To finish off the "then" block, we create an unconditional branch
Chris Lattner602c832c2007-10-31 06:30:21 +0000427to the merge block. One interesting (and very important) aspect of the LLVM IR
428is that it <a href="../LangRef.html#functionstructure">requires all basic blocks
429to be "terminated"</a> with a <a href="../LangRef.html#terminators">control flow
430instruction</a> such as return or branch. This means that all control flow,
431<em>including fall throughs</em> must be made explicit in the LLVM IR. If you
432violate this rule, the verifier will emit an error.</p>
433
434<p>The final line here is quite subtle, but is very important. The basic issue
435is that when we create the Phi node in the merge block, we need to set up the
436block/value pairs that indicate how the Phi will work. Importantly, the Phi
Chris Lattnerb5019642007-11-05 17:52:04 +0000437node expects to have an entry for each predecessor of the block in the CFG. Why
Chris Lattner41fcea32007-11-13 07:06:30 +0000438then, are we getting the current block when we just set it to ThenBB 5 lines
Chris Lattner602c832c2007-10-31 06:30:21 +0000439above? The problem is that the "Then" expression may actually itself change the
440block that the Builder is emitting into if, for example, it contains a nested
441"if/then/else" expression. Because calling Codegen recursively could
442arbitrarily change the notion of the current block, we are required to get an
443up-to-date value for code that will set up the Phi node.</p>
444
445<div class="doc_code">
446<pre>
447 // Emit else block.
448 TheFunction-&gt;getBasicBlockList().push_back(ElseBB);
449 Builder.SetInsertPoint(ElseBB);
450
451 Value *ElseV = Else-&gt;Codegen();
452 if (ElseV == 0) return 0;
453
454 Builder.CreateBr(MergeBB);
455 // Codegen of 'Else' can change the current block, update ElseBB for the PHI.
456 ElseBB = Builder.GetInsertBlock();
457</pre>
458</div>
459
460<p>Code generation for the 'else' block is basically identical to codegen for
461the 'then' block. The only significant difference is the first line, which adds
462the 'else' block to the function. Recall previously that the 'else' block was
463created, but not added to the function. Now that the 'then' and 'else' blocks
464are emitted, we can finish up with the merge code:</p>
465
466<div class="doc_code">
467<pre>
468 // Emit merge block.
469 TheFunction->getBasicBlockList().push_back(MergeBB);
470 Builder.SetInsertPoint(MergeBB);
Jay Foad3ecfc862011-03-30 11:28:46 +0000471 PHINode *PN = Builder.CreatePHI(Type::getDoubleTy(getGlobalContext()), 2,
Erick Tryzelaarfd1ec5e2009-09-22 21:14:49 +0000472 "iftmp");
Chris Lattner602c832c2007-10-31 06:30:21 +0000473
474 PN->addIncoming(ThenV, ThenBB);
475 PN->addIncoming(ElseV, ElseBB);
476 return PN;
477}
478</pre>
479</div>
480
481<p>The first two lines here are now familiar: the first adds the "merge" block
482to the Function object (it was previously floating, like the else block above).
483The second block changes the insertion point so that newly created code will go
484into the "merge" block. Once that is done, we need to create the PHI node and
485set up the block/value pairs for the PHI.</p>
486
487<p>Finally, the CodeGen function returns the phi node as the value computed by
488the if/then/else expression. In our example above, this returned value will
489feed into the code for the top-level function, which will create the return
490instruction.</p>
491
Chris Lattner41fcea32007-11-13 07:06:30 +0000492<p>Overall, we now have the ability to execute conditional code in
Chris Lattner602c832c2007-10-31 06:30:21 +0000493Kaleidoscope. With this extension, Kaleidoscope is a fairly complete language
494that can calculate a wide variety of numeric functions. Next up we'll add
495another useful expression that is familiar from non-functional languages...</p>
496
497</div>
498
499<!-- *********************************************************************** -->
NAKAMURA Takumi05d02652011-04-18 23:59:50 +0000500<h2><a name="for">'for' Loop Expression</a></h2>
Chris Lattner602c832c2007-10-31 06:30:21 +0000501<!-- *********************************************************************** -->
502
503<div class="doc_text">
504
Chris Lattnerf5234802007-10-31 06:47:39 +0000505<p>Now that we know how to add basic control flow constructs to the language,
506we have the tools to add more powerful things. Lets add something more
507aggressive, a 'for' expression:</p>
508
509<div class="doc_code">
510<pre>
Chris Lattnerf5234802007-10-31 06:47:39 +0000511 extern putchard(char)
Chris Lattner6093bd52007-10-31 07:29:43 +0000512 def printstar(n)
513 for i = 1, i &lt; n, 1.0 in
514 putchard(42); # ascii 42 = '*'
515
516 # print 100 '*' characters
517 printstar(100);
Chris Lattnerf5234802007-10-31 06:47:39 +0000518</pre>
519</div>
520
Chris Lattner6093bd52007-10-31 07:29:43 +0000521<p>This expression defines a new variable ("i" in this case) which iterates from
522a starting value, while the condition ("i &lt; n" in this case) is true,
Chris Lattnerf5234802007-10-31 06:47:39 +0000523incrementing by an optional step value ("1.0" in this case). If the step value
524is omitted, it defaults to 1.0. While the loop is true, it executes its
525body expression. Because we don't have anything better to return, we'll just
526define the loop as always returning 0.0. In the future when we have mutable
527variables, it will get more useful.</p>
528
529<p>As before, lets talk about the changes that we need to Kaleidoscope to
530support this.</p>
531
532</div>
533
534<!-- ======================================================================= -->
NAKAMURA Takumi05d02652011-04-18 23:59:50 +0000535<h4><a name="forlexer">Lexer Extensions for the 'for' Loop</a></h4>
Chris Lattnerf5234802007-10-31 06:47:39 +0000536<!-- ======================================================================= -->
537
538<div class="doc_text">
539
540<p>The lexer extensions are the same sort of thing as for if/then/else:</p>
541
542<div class="doc_code">
543<pre>
544 ... in enum Token ...
545 // control
546 tok_if = -6, tok_then = -7, tok_else = -8,
547<b> tok_for = -9, tok_in = -10</b>
548
549 ... in gettok ...
550 if (IdentifierStr == "def") return tok_def;
551 if (IdentifierStr == "extern") return tok_extern;
552 if (IdentifierStr == "if") return tok_if;
553 if (IdentifierStr == "then") return tok_then;
554 if (IdentifierStr == "else") return tok_else;
555 <b>if (IdentifierStr == "for") return tok_for;
556 if (IdentifierStr == "in") return tok_in;</b>
557 return tok_identifier;
558</pre>
559</div>
560
561</div>
562
563<!-- ======================================================================= -->
NAKAMURA Takumi05d02652011-04-18 23:59:50 +0000564<h4><a name="forast">AST Extensions for the 'for' Loop</a></h4>
Chris Lattnerf5234802007-10-31 06:47:39 +0000565<!-- ======================================================================= -->
566
567<div class="doc_text">
568
Chris Lattner41fcea32007-11-13 07:06:30 +0000569<p>The AST node is just as simple. It basically boils down to capturing
Chris Lattner1092a962007-11-07 05:47:48 +0000570the variable name and the constituent expressions in the node.</p>
Chris Lattnerf5234802007-10-31 06:47:39 +0000571
572<div class="doc_code">
573<pre>
574/// ForExprAST - Expression class for for/in.
575class ForExprAST : public ExprAST {
576 std::string VarName;
577 ExprAST *Start, *End, *Step, *Body;
578public:
579 ForExprAST(const std::string &amp;varname, ExprAST *start, ExprAST *end,
580 ExprAST *step, ExprAST *body)
581 : VarName(varname), Start(start), End(end), Step(step), Body(body) {}
582 virtual Value *Codegen();
583};
584</pre>
585</div>
586
587</div>
588
589<!-- ======================================================================= -->
NAKAMURA Takumi05d02652011-04-18 23:59:50 +0000590<h4><a name="forparser">Parser Extensions for the 'for' Loop</a></h4>
Chris Lattnerf5234802007-10-31 06:47:39 +0000591<!-- ======================================================================= -->
592
593<div class="doc_text">
594
595<p>The parser code is also fairly standard. The only interesting thing here is
596handling of the optional step value. The parser code handles it by checking to
597see if the second comma is present. If not, it sets the step value to null in
598the AST node:</p>
599
600<div class="doc_code">
601<pre>
Chris Lattner20a0c802007-11-05 17:54:34 +0000602/// forexpr ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression
Chris Lattnerf5234802007-10-31 06:47:39 +0000603static ExprAST *ParseForExpr() {
604 getNextToken(); // eat the for.
605
606 if (CurTok != tok_identifier)
607 return Error("expected identifier after for");
608
609 std::string IdName = IdentifierStr;
Chris Lattner20a0c802007-11-05 17:54:34 +0000610 getNextToken(); // eat identifier.
Chris Lattnerf5234802007-10-31 06:47:39 +0000611
612 if (CurTok != '=')
613 return Error("expected '=' after for");
614 getNextToken(); // eat '='.
615
616
617 ExprAST *Start = ParseExpression();
618 if (Start == 0) return 0;
619 if (CurTok != ',')
620 return Error("expected ',' after for start value");
621 getNextToken();
622
623 ExprAST *End = ParseExpression();
624 if (End == 0) return 0;
625
626 // The step value is optional.
627 ExprAST *Step = 0;
628 if (CurTok == ',') {
629 getNextToken();
630 Step = ParseExpression();
631 if (Step == 0) return 0;
632 }
633
634 if (CurTok != tok_in)
635 return Error("expected 'in' after for");
636 getNextToken(); // eat 'in'.
637
638 ExprAST *Body = ParseExpression();
639 if (Body == 0) return 0;
640
641 return new ForExprAST(IdName, Start, End, Step, Body);
642}
643</pre>
644</div>
645
646</div>
647
648<!-- ======================================================================= -->
NAKAMURA Takumi05d02652011-04-18 23:59:50 +0000649<h4><a name="forir">LLVM IR for the 'for' Loop</a></h4>
Chris Lattnerf5234802007-10-31 06:47:39 +0000650<!-- ======================================================================= -->
651
652<div class="doc_text">
653
654<p>Now we get to the good part: the LLVM IR we want to generate for this thing.
Gordon Henriksenbb310f12007-11-12 13:46:21 +0000655With the simple example above, we get this LLVM IR (note that this dump is
656generated with optimizations disabled for clarity):
Chris Lattnerf5234802007-10-31 06:47:39 +0000657</p>
658
Chris Lattner6093bd52007-10-31 07:29:43 +0000659<div class="doc_code">
660<pre>
661declare double @putchard(double)
Chris Lattnerf5234802007-10-31 06:47:39 +0000662
Chris Lattner6093bd52007-10-31 07:29:43 +0000663define double @printstar(double %n) {
664entry:
665 ; initial value = 1.0 (inlined into phi)
666 br label %loop
667
668loop: ; preds = %loop, %entry
669 %i = phi double [ 1.000000e+00, %entry ], [ %nextvar, %loop ]
670 ; body
Dan Gohman3dfb3cf2010-05-28 17:07:41 +0000671 %calltmp = call double @putchard(double 4.200000e+01)
Chris Lattner6093bd52007-10-31 07:29:43 +0000672 ; increment
Dan Gohmana9445e12010-03-02 01:11:08 +0000673 %nextvar = fadd double %i, 1.000000e+00
Chris Lattner6093bd52007-10-31 07:29:43 +0000674
675 ; termination test
Chris Lattner71155212007-11-06 01:39:12 +0000676 %cmptmp = fcmp ult double %i, %n
677 %booltmp = uitofp i1 %cmptmp to double
Chris Lattner6093bd52007-10-31 07:29:43 +0000678 %loopcond = fcmp one double %booltmp, 0.000000e+00
679 br i1 %loopcond, label %loop, label %afterloop
680
681afterloop: ; preds = %loop
682 ; loop always returns 0.0
683 ret double 0.000000e+00
684}
685</pre>
686</div>
687
688<p>This loop contains all the same constructs we saw before: a phi node, several
689expressions, and some basic blocks. Lets see how this fits together.</p>
Chris Lattnerf5234802007-10-31 06:47:39 +0000690
691</div>
692
693<!-- ======================================================================= -->
NAKAMURA Takumi05d02652011-04-18 23:59:50 +0000694<h4><a name="forcodegen">Code Generation for the 'for' Loop</a></h4>
Chris Lattnerf5234802007-10-31 06:47:39 +0000695<!-- ======================================================================= -->
696
697<div class="doc_text">
698
Chris Lattner41fcea32007-11-13 07:06:30 +0000699<p>The first part of Codegen is very simple: we just output the start expression
Chris Lattner6093bd52007-10-31 07:29:43 +0000700for the loop value:</p>
Chris Lattnerf5234802007-10-31 06:47:39 +0000701
702<div class="doc_code">
703<pre>
704Value *ForExprAST::Codegen() {
Chris Lattner6093bd52007-10-31 07:29:43 +0000705 // Emit the start code first, without 'variable' in scope.
706 Value *StartVal = Start-&gt;Codegen();
707 if (StartVal == 0) return 0;
708</pre>
709</div>
710
711<p>With this out of the way, the next step is to set up the LLVM basic block
712for the start of the loop body. In the case above, the whole loop body is one
713block, but remember that the body code itself could consist of multiple blocks
Chris Lattner1092a962007-11-07 05:47:48 +0000714(e.g. if it contains an if/then/else or a for/in expression).</p>
Chris Lattner6093bd52007-10-31 07:29:43 +0000715
716<div class="doc_code">
717<pre>
718 // Make the new basic block for the loop header, inserting after current
719 // block.
720 Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();
721 BasicBlock *PreheaderBB = Builder.GetInsertBlock();
Owen Anderson1d0be152009-08-13 21:58:54 +0000722 BasicBlock *LoopBB = BasicBlock::Create(getGlobalContext(), "loop", TheFunction);
Chris Lattner6093bd52007-10-31 07:29:43 +0000723
724 // Insert an explicit fall through from the current block to the LoopBB.
725 Builder.CreateBr(LoopBB);
726</pre>
727</div>
728
729<p>This code is similar to what we saw for if/then/else. Because we will need
730it to create the Phi node, we remember the block that falls through into the
731loop. Once we have that, we create the actual block that starts the loop and
732create an unconditional branch for the fall-through between the two blocks.</p>
733
734<div class="doc_code">
735<pre>
736 // Start insertion in LoopBB.
737 Builder.SetInsertPoint(LoopBB);
738
739 // Start the PHI node with an entry for Start.
Jay Foad3ecfc862011-03-30 11:28:46 +0000740 PHINode *Variable = Builder.CreatePHI(Type::getDoubleTy(getGlobalContext()), 2, VarName.c_str());
Chris Lattner6093bd52007-10-31 07:29:43 +0000741 Variable-&gt;addIncoming(StartVal, PreheaderBB);
742</pre>
743</div>
744
745<p>Now that the "preheader" for the loop is set up, we switch to emitting code
746for the loop body. To begin with, we move the insertion point and create the
Chris Lattner1092a962007-11-07 05:47:48 +0000747PHI node for the loop induction variable. Since we already know the incoming
Chris Lattner6093bd52007-10-31 07:29:43 +0000748value for the starting value, we add it to the Phi node. Note that the Phi will
749eventually get a second value for the backedge, but we can't set it up yet
750(because it doesn't exist!).</p>
751
752<div class="doc_code">
753<pre>
754 // Within the loop, the variable is defined equal to the PHI node. If it
755 // shadows an existing variable, we have to restore it, so save it now.
756 Value *OldVal = NamedValues[VarName];
757 NamedValues[VarName] = Variable;
758
759 // Emit the body of the loop. This, like any other expr, can change the
760 // current BB. Note that we ignore the value computed by the body, but don't
761 // allow an error.
762 if (Body-&gt;Codegen() == 0)
763 return 0;
764</pre>
765</div>
766
767<p>Now the code starts to get more interesting. Our 'for' loop introduces a new
768variable to the symbol table. This means that our symbol table can now contain
769either function arguments or loop variables. To handle this, before we codegen
770the body of the loop, we add the loop variable as the current value for its
771name. Note that it is possible that there is a variable of the same name in the
772outer scope. It would be easy to make this an error (emit an error and return
773null if there is already an entry for VarName) but we choose to allow shadowing
774of variables. In order to handle this correctly, we remember the Value that
775we are potentially shadowing in <tt>OldVal</tt> (which will be null if there is
776no shadowed variable).</p>
777
778<p>Once the loop variable is set into the symbol table, the code recursively
779codegen's the body. This allows the body to use the loop variable: any
780references to it will naturally find it in the symbol table.</p>
781
782<div class="doc_code">
783<pre>
784 // Emit the step value.
785 Value *StepVal;
786 if (Step) {
787 StepVal = Step-&gt;Codegen();
788 if (StepVal == 0) return 0;
789 } else {
790 // If not specified, use 1.0.
Owen Anderson6f83c9c2009-07-27 20:59:43 +0000791 StepVal = ConstantFP::get(getGlobalContext(), APFloat(1.0));
Chris Lattner6093bd52007-10-31 07:29:43 +0000792 }
793
Chris Lattnerf57fcf72010-09-01 20:09:20 +0000794 Value *NextVar = Builder.CreateFAdd(Variable, StepVal, "nextvar");
Chris Lattner6093bd52007-10-31 07:29:43 +0000795</pre>
796</div>
797
798<p>Now that the body is emitted, we compute the next value of the iteration
Chris Lattner41fcea32007-11-13 07:06:30 +0000799variable by adding the step value, or 1.0 if it isn't present. '<tt>NextVar</tt>'
Chris Lattner6093bd52007-10-31 07:29:43 +0000800will be the value of the loop variable on the next iteration of the loop.</p>
801
802<div class="doc_code">
803<pre>
804 // Compute the end condition.
805 Value *EndCond = End-&gt;Codegen();
806 if (EndCond == 0) return EndCond;
807
808 // Convert condition to a bool by comparing equal to 0.0.
809 EndCond = Builder.CreateFCmpONE(EndCond,
Owen Anderson6f83c9c2009-07-27 20:59:43 +0000810 ConstantFP::get(getGlobalContext(), APFloat(0.0)),
Chris Lattner6093bd52007-10-31 07:29:43 +0000811 "loopcond");
812</pre>
813</div>
814
815<p>Finally, we evaluate the exit value of the loop, to determine whether the
816loop should exit. This mirrors the condition evaluation for the if/then/else
817statement.</p>
818
819<div class="doc_code">
820<pre>
821 // Create the "after loop" block and insert it.
822 BasicBlock *LoopEndBB = Builder.GetInsertBlock();
Owen Anderson1d0be152009-08-13 21:58:54 +0000823 BasicBlock *AfterBB = BasicBlock::Create(getGlobalContext(), "afterloop", TheFunction);
Chris Lattner6093bd52007-10-31 07:29:43 +0000824
825 // Insert the conditional branch into the end of LoopEndBB.
826 Builder.CreateCondBr(EndCond, LoopBB, AfterBB);
827
828 // Any new code will be inserted in AfterBB.
829 Builder.SetInsertPoint(AfterBB);
830</pre>
831</div>
832
833<p>With the code for the body of the loop complete, we just need to finish up
Chris Lattner41fcea32007-11-13 07:06:30 +0000834the control flow for it. This code remembers the end block (for the phi node), then creates the block for the loop exit ("afterloop"). Based on the value of the
Chris Lattner6093bd52007-10-31 07:29:43 +0000835exit condition, it creates a conditional branch that chooses between executing
836the loop again and exiting the loop. Any future code is emitted in the
837"afterloop" block, so it sets the insertion position to it.</p>
838
839<div class="doc_code">
840<pre>
841 // Add a new entry to the PHI node for the backedge.
842 Variable-&gt;addIncoming(NextVar, LoopEndBB);
843
844 // Restore the unshadowed variable.
845 if (OldVal)
846 NamedValues[VarName] = OldVal;
847 else
848 NamedValues.erase(VarName);
849
850 // for expr always returns 0.0.
Owen Anderson1d0be152009-08-13 21:58:54 +0000851 return Constant::getNullValue(Type::getDoubleTy(getGlobalContext()));
Chris Lattner6093bd52007-10-31 07:29:43 +0000852}
853</pre>
854</div>
855
856<p>The final code handles various cleanups: now that we have the "NextVar"
857value, we can add the incoming value to the loop PHI node. After that, we
858remove the loop variable from the symbol table, so that it isn't in scope after
859the for loop. Finally, code generation of the for loop always returns 0.0, so
860that is what we return from <tt>ForExprAST::Codegen</tt>.</p>
861
862<p>With this, we conclude the "adding control flow to Kaleidoscope" chapter of
Chris Lattner41fcea32007-11-13 07:06:30 +0000863the tutorial. In this chapter we added two control flow constructs, and used them to motivate a couple of aspects of the LLVM IR that are important for front-end implementors
Chris Lattner6093bd52007-10-31 07:29:43 +0000864to know. In the next chapter of our saga, we will get a bit crazier and add
Chris Lattner71155212007-11-06 01:39:12 +0000865<a href="LangImpl6.html">user-defined operators</a> to our poor innocent
866language.</p>
Chris Lattner6093bd52007-10-31 07:29:43 +0000867
868</div>
869
870<!-- *********************************************************************** -->
NAKAMURA Takumi05d02652011-04-18 23:59:50 +0000871<h2><a name="code">Full Code Listing</a></h2>
Chris Lattner6093bd52007-10-31 07:29:43 +0000872<!-- *********************************************************************** -->
873
874<div class="doc_text">
875
876<p>
877Here is the complete code listing for our running example, enhanced with the
878if/then/else and for expressions.. To build this example, use:
879</p>
880
881<div class="doc_code">
882<pre>
883 # Compile
884 g++ -g toy.cpp `llvm-config --cppflags --ldflags --libs core jit native` -O3 -o toy
885 # Run
886 ./toy
887</pre>
888</div>
889
890<p>Here is the code:</p>
891
892<div class="doc_code">
893<pre>
894#include "llvm/DerivedTypes.h"
895#include "llvm/ExecutionEngine/ExecutionEngine.h"
Nick Lewycky422094c2009-09-13 21:38:54 +0000896#include "llvm/ExecutionEngine/JIT.h"
Owen Andersond1fbd142009-07-08 20:50:47 +0000897#include "llvm/LLVMContext.h"
Chris Lattner6093bd52007-10-31 07:29:43 +0000898#include "llvm/Module.h"
Chris Lattner6093bd52007-10-31 07:29:43 +0000899#include "llvm/PassManager.h"
900#include "llvm/Analysis/Verifier.h"
Dan Gohmanab7fa082010-11-16 17:28:22 +0000901#include "llvm/Analysis/Passes.h"
Chris Lattner6093bd52007-10-31 07:29:43 +0000902#include "llvm/Target/TargetData.h"
Nick Lewycky422094c2009-09-13 21:38:54 +0000903#include "llvm/Target/TargetSelect.h"
Chris Lattner6093bd52007-10-31 07:29:43 +0000904#include "llvm/Transforms/Scalar.h"
Duncan Sands89f6d882008-04-13 06:22:09 +0000905#include "llvm/Support/IRBuilder.h"
Chris Lattner6093bd52007-10-31 07:29:43 +0000906#include &lt;cstdio&gt;
907#include &lt;string&gt;
908#include &lt;map&gt;
909#include &lt;vector&gt;
910using namespace llvm;
911
912//===----------------------------------------------------------------------===//
913// Lexer
914//===----------------------------------------------------------------------===//
915
916// The lexer returns tokens [0-255] if it is an unknown character, otherwise one
917// of these for known things.
918enum Token {
919 tok_eof = -1,
920
921 // commands
922 tok_def = -2, tok_extern = -3,
923
924 // primary
925 tok_identifier = -4, tok_number = -5,
926
927 // control
928 tok_if = -6, tok_then = -7, tok_else = -8,
929 tok_for = -9, tok_in = -10
930};
931
932static std::string IdentifierStr; // Filled in if tok_identifier
933static double NumVal; // Filled in if tok_number
934
935/// gettok - Return the next token from standard input.
936static int gettok() {
937 static int LastChar = ' ';
938
939 // Skip any whitespace.
940 while (isspace(LastChar))
941 LastChar = getchar();
942
943 if (isalpha(LastChar)) { // identifier: [a-zA-Z][a-zA-Z0-9]*
944 IdentifierStr = LastChar;
945 while (isalnum((LastChar = getchar())))
946 IdentifierStr += LastChar;
947
948 if (IdentifierStr == "def") return tok_def;
949 if (IdentifierStr == "extern") return tok_extern;
950 if (IdentifierStr == "if") return tok_if;
951 if (IdentifierStr == "then") return tok_then;
952 if (IdentifierStr == "else") return tok_else;
953 if (IdentifierStr == "for") return tok_for;
954 if (IdentifierStr == "in") return tok_in;
955 return tok_identifier;
956 }
957
958 if (isdigit(LastChar) || LastChar == '.') { // Number: [0-9.]+
959 std::string NumStr;
960 do {
961 NumStr += LastChar;
962 LastChar = getchar();
963 } while (isdigit(LastChar) || LastChar == '.');
964
965 NumVal = strtod(NumStr.c_str(), 0);
966 return tok_number;
967 }
968
969 if (LastChar == '#') {
970 // Comment until end of line.
971 do LastChar = getchar();
Chris Lattnerc80c23f2007-12-02 22:46:01 +0000972 while (LastChar != EOF &amp;&amp; LastChar != '\n' &amp;&amp; LastChar != '\r');
Chris Lattner6093bd52007-10-31 07:29:43 +0000973
974 if (LastChar != EOF)
975 return gettok();
976 }
977
978 // Check for end of file. Don't eat the EOF.
979 if (LastChar == EOF)
980 return tok_eof;
981
982 // Otherwise, just return the character as its ascii value.
983 int ThisChar = LastChar;
984 LastChar = getchar();
985 return ThisChar;
986}
987
988//===----------------------------------------------------------------------===//
989// Abstract Syntax Tree (aka Parse Tree)
990//===----------------------------------------------------------------------===//
991
992/// ExprAST - Base class for all expression nodes.
993class ExprAST {
994public:
995 virtual ~ExprAST() {}
996 virtual Value *Codegen() = 0;
997};
998
999/// NumberExprAST - Expression class for numeric literals like "1.0".
1000class NumberExprAST : public ExprAST {
1001 double Val;
1002public:
1003 NumberExprAST(double val) : Val(val) {}
1004 virtual Value *Codegen();
1005};
1006
1007/// VariableExprAST - Expression class for referencing a variable, like "a".
1008class VariableExprAST : public ExprAST {
1009 std::string Name;
1010public:
1011 VariableExprAST(const std::string &amp;name) : Name(name) {}
1012 virtual Value *Codegen();
1013};
1014
1015/// BinaryExprAST - Expression class for a binary operator.
1016class BinaryExprAST : public ExprAST {
1017 char Op;
1018 ExprAST *LHS, *RHS;
1019public:
1020 BinaryExprAST(char op, ExprAST *lhs, ExprAST *rhs)
1021 : Op(op), LHS(lhs), RHS(rhs) {}
1022 virtual Value *Codegen();
1023};
1024
1025/// CallExprAST - Expression class for function calls.
1026class CallExprAST : public ExprAST {
1027 std::string Callee;
1028 std::vector&lt;ExprAST*&gt; Args;
1029public:
1030 CallExprAST(const std::string &amp;callee, std::vector&lt;ExprAST*&gt; &amp;args)
1031 : Callee(callee), Args(args) {}
1032 virtual Value *Codegen();
1033};
1034
1035/// IfExprAST - Expression class for if/then/else.
1036class IfExprAST : public ExprAST {
1037 ExprAST *Cond, *Then, *Else;
1038public:
1039 IfExprAST(ExprAST *cond, ExprAST *then, ExprAST *_else)
1040 : Cond(cond), Then(then), Else(_else) {}
1041 virtual Value *Codegen();
1042};
1043
1044/// ForExprAST - Expression class for for/in.
1045class ForExprAST : public ExprAST {
1046 std::string VarName;
1047 ExprAST *Start, *End, *Step, *Body;
1048public:
1049 ForExprAST(const std::string &amp;varname, ExprAST *start, ExprAST *end,
1050 ExprAST *step, ExprAST *body)
1051 : VarName(varname), Start(start), End(end), Step(step), Body(body) {}
1052 virtual Value *Codegen();
1053};
1054
1055/// PrototypeAST - This class represents the "prototype" for a function,
Erick Tryzelaarfd1ec5e2009-09-22 21:14:49 +00001056/// which captures its name, and its argument names (thus implicitly the number
1057/// of arguments the function takes).
Chris Lattner6093bd52007-10-31 07:29:43 +00001058class PrototypeAST {
1059 std::string Name;
1060 std::vector&lt;std::string&gt; Args;
1061public:
1062 PrototypeAST(const std::string &amp;name, const std::vector&lt;std::string&gt; &amp;args)
1063 : Name(name), Args(args) {}
1064
1065 Function *Codegen();
1066};
1067
1068/// FunctionAST - This class represents a function definition itself.
1069class FunctionAST {
1070 PrototypeAST *Proto;
1071 ExprAST *Body;
1072public:
1073 FunctionAST(PrototypeAST *proto, ExprAST *body)
1074 : Proto(proto), Body(body) {}
1075
1076 Function *Codegen();
1077};
1078
1079//===----------------------------------------------------------------------===//
1080// Parser
1081//===----------------------------------------------------------------------===//
1082
1083/// CurTok/getNextToken - Provide a simple token buffer. CurTok is the current
Erick Tryzelaarfd1ec5e2009-09-22 21:14:49 +00001084/// token the parser is looking at. getNextToken reads another token from the
Chris Lattner6093bd52007-10-31 07:29:43 +00001085/// lexer and updates CurTok with its results.
1086static int CurTok;
1087static int getNextToken() {
1088 return CurTok = gettok();
1089}
1090
1091/// BinopPrecedence - This holds the precedence for each binary operator that is
1092/// defined.
1093static std::map&lt;char, int&gt; BinopPrecedence;
1094
1095/// GetTokPrecedence - Get the precedence of the pending binary operator token.
1096static int GetTokPrecedence() {
1097 if (!isascii(CurTok))
1098 return -1;
1099
1100 // Make sure it's a declared binop.
1101 int TokPrec = BinopPrecedence[CurTok];
1102 if (TokPrec &lt;= 0) return -1;
1103 return TokPrec;
1104}
1105
1106/// Error* - These are little helper functions for error handling.
1107ExprAST *Error(const char *Str) { fprintf(stderr, "Error: %s\n", Str);return 0;}
1108PrototypeAST *ErrorP(const char *Str) { Error(Str); return 0; }
1109FunctionAST *ErrorF(const char *Str) { Error(Str); return 0; }
1110
1111static ExprAST *ParseExpression();
1112
1113/// identifierexpr
Chris Lattner20a0c802007-11-05 17:54:34 +00001114/// ::= identifier
1115/// ::= identifier '(' expression* ')'
Chris Lattner6093bd52007-10-31 07:29:43 +00001116static ExprAST *ParseIdentifierExpr() {
1117 std::string IdName = IdentifierStr;
1118
Chris Lattner20a0c802007-11-05 17:54:34 +00001119 getNextToken(); // eat identifier.
Chris Lattner6093bd52007-10-31 07:29:43 +00001120
1121 if (CurTok != '(') // Simple variable ref.
1122 return new VariableExprAST(IdName);
1123
1124 // Call.
1125 getNextToken(); // eat (
1126 std::vector&lt;ExprAST*&gt; Args;
1127 if (CurTok != ')') {
1128 while (1) {
1129 ExprAST *Arg = ParseExpression();
1130 if (!Arg) return 0;
1131 Args.push_back(Arg);
Erick Tryzelaarfd1ec5e2009-09-22 21:14:49 +00001132
Chris Lattner6093bd52007-10-31 07:29:43 +00001133 if (CurTok == ')') break;
Erick Tryzelaarfd1ec5e2009-09-22 21:14:49 +00001134
Chris Lattner6093bd52007-10-31 07:29:43 +00001135 if (CurTok != ',')
Chris Lattner6c4be9c2008-04-14 16:44:41 +00001136 return Error("Expected ')' or ',' in argument list");
Chris Lattner6093bd52007-10-31 07:29:43 +00001137 getNextToken();
1138 }
1139 }
1140
1141 // Eat the ')'.
1142 getNextToken();
1143
1144 return new CallExprAST(IdName, Args);
1145}
1146
1147/// numberexpr ::= number
1148static ExprAST *ParseNumberExpr() {
1149 ExprAST *Result = new NumberExprAST(NumVal);
1150 getNextToken(); // consume the number
1151 return Result;
1152}
1153
1154/// parenexpr ::= '(' expression ')'
1155static ExprAST *ParseParenExpr() {
1156 getNextToken(); // eat (.
1157 ExprAST *V = ParseExpression();
1158 if (!V) return 0;
1159
1160 if (CurTok != ')')
1161 return Error("expected ')'");
1162 getNextToken(); // eat ).
1163 return V;
1164}
1165
1166/// ifexpr ::= 'if' expression 'then' expression 'else' expression
1167static ExprAST *ParseIfExpr() {
1168 getNextToken(); // eat the if.
1169
1170 // condition.
1171 ExprAST *Cond = ParseExpression();
1172 if (!Cond) return 0;
1173
1174 if (CurTok != tok_then)
1175 return Error("expected then");
1176 getNextToken(); // eat the then
1177
1178 ExprAST *Then = ParseExpression();
1179 if (Then == 0) return 0;
1180
1181 if (CurTok != tok_else)
1182 return Error("expected else");
1183
1184 getNextToken();
1185
1186 ExprAST *Else = ParseExpression();
1187 if (!Else) return 0;
1188
1189 return new IfExprAST(Cond, Then, Else);
1190}
1191
Chris Lattner20a0c802007-11-05 17:54:34 +00001192/// forexpr ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression
Chris Lattner6093bd52007-10-31 07:29:43 +00001193static ExprAST *ParseForExpr() {
1194 getNextToken(); // eat the for.
1195
1196 if (CurTok != tok_identifier)
1197 return Error("expected identifier after for");
1198
1199 std::string IdName = IdentifierStr;
Chris Lattner20a0c802007-11-05 17:54:34 +00001200 getNextToken(); // eat identifier.
Chris Lattner6093bd52007-10-31 07:29:43 +00001201
1202 if (CurTok != '=')
1203 return Error("expected '=' after for");
1204 getNextToken(); // eat '='.
1205
1206
1207 ExprAST *Start = ParseExpression();
1208 if (Start == 0) return 0;
1209 if (CurTok != ',')
1210 return Error("expected ',' after for start value");
1211 getNextToken();
1212
1213 ExprAST *End = ParseExpression();
1214 if (End == 0) return 0;
1215
1216 // The step value is optional.
1217 ExprAST *Step = 0;
1218 if (CurTok == ',') {
1219 getNextToken();
1220 Step = ParseExpression();
1221 if (Step == 0) return 0;
1222 }
1223
1224 if (CurTok != tok_in)
1225 return Error("expected 'in' after for");
1226 getNextToken(); // eat 'in'.
1227
1228 ExprAST *Body = ParseExpression();
1229 if (Body == 0) return 0;
1230
1231 return new ForExprAST(IdName, Start, End, Step, Body);
1232}
1233
Chris Lattner6093bd52007-10-31 07:29:43 +00001234/// primary
1235/// ::= identifierexpr
1236/// ::= numberexpr
1237/// ::= parenexpr
1238/// ::= ifexpr
1239/// ::= forexpr
1240static ExprAST *ParsePrimary() {
1241 switch (CurTok) {
1242 default: return Error("unknown token when expecting an expression");
1243 case tok_identifier: return ParseIdentifierExpr();
1244 case tok_number: return ParseNumberExpr();
1245 case '(': return ParseParenExpr();
1246 case tok_if: return ParseIfExpr();
1247 case tok_for: return ParseForExpr();
1248 }
1249}
1250
1251/// binoprhs
1252/// ::= ('+' primary)*
1253static ExprAST *ParseBinOpRHS(int ExprPrec, ExprAST *LHS) {
1254 // If this is a binop, find its precedence.
1255 while (1) {
1256 int TokPrec = GetTokPrecedence();
1257
1258 // If this is a binop that binds at least as tightly as the current binop,
1259 // consume it, otherwise we are done.
1260 if (TokPrec &lt; ExprPrec)
1261 return LHS;
1262
1263 // Okay, we know this is a binop.
1264 int BinOp = CurTok;
1265 getNextToken(); // eat binop
1266
1267 // Parse the primary expression after the binary operator.
1268 ExprAST *RHS = ParsePrimary();
1269 if (!RHS) return 0;
1270
1271 // If BinOp binds less tightly with RHS than the operator after RHS, let
1272 // the pending operator take RHS as its LHS.
1273 int NextPrec = GetTokPrecedence();
1274 if (TokPrec &lt; NextPrec) {
1275 RHS = ParseBinOpRHS(TokPrec+1, RHS);
1276 if (RHS == 0) return 0;
1277 }
1278
1279 // Merge LHS/RHS.
1280 LHS = new BinaryExprAST(BinOp, LHS, RHS);
1281 }
1282}
1283
1284/// expression
1285/// ::= primary binoprhs
1286///
1287static ExprAST *ParseExpression() {
1288 ExprAST *LHS = ParsePrimary();
1289 if (!LHS) return 0;
1290
1291 return ParseBinOpRHS(0, LHS);
1292}
1293
1294/// prototype
1295/// ::= id '(' id* ')'
1296static PrototypeAST *ParsePrototype() {
1297 if (CurTok != tok_identifier)
1298 return ErrorP("Expected function name in prototype");
1299
1300 std::string FnName = IdentifierStr;
1301 getNextToken();
1302
1303 if (CurTok != '(')
1304 return ErrorP("Expected '(' in prototype");
1305
1306 std::vector&lt;std::string&gt; ArgNames;
1307 while (getNextToken() == tok_identifier)
1308 ArgNames.push_back(IdentifierStr);
1309 if (CurTok != ')')
1310 return ErrorP("Expected ')' in prototype");
1311
1312 // success.
1313 getNextToken(); // eat ')'.
1314
1315 return new PrototypeAST(FnName, ArgNames);
1316}
1317
1318/// definition ::= 'def' prototype expression
1319static FunctionAST *ParseDefinition() {
1320 getNextToken(); // eat def.
1321 PrototypeAST *Proto = ParsePrototype();
1322 if (Proto == 0) return 0;
1323
1324 if (ExprAST *E = ParseExpression())
1325 return new FunctionAST(Proto, E);
1326 return 0;
1327}
1328
1329/// toplevelexpr ::= expression
1330static FunctionAST *ParseTopLevelExpr() {
1331 if (ExprAST *E = ParseExpression()) {
1332 // Make an anonymous proto.
1333 PrototypeAST *Proto = new PrototypeAST("", std::vector&lt;std::string&gt;());
1334 return new FunctionAST(Proto, E);
1335 }
1336 return 0;
1337}
1338
1339/// external ::= 'extern' prototype
1340static PrototypeAST *ParseExtern() {
1341 getNextToken(); // eat extern.
1342 return ParsePrototype();
1343}
1344
1345//===----------------------------------------------------------------------===//
1346// Code Generation
1347//===----------------------------------------------------------------------===//
1348
1349static Module *TheModule;
Owen Andersond1fbd142009-07-08 20:50:47 +00001350static IRBuilder&lt;&gt; Builder(getGlobalContext());
Chris Lattner6093bd52007-10-31 07:29:43 +00001351static std::map&lt;std::string, Value*&gt; NamedValues;
1352static FunctionPassManager *TheFPM;
1353
1354Value *ErrorV(const char *Str) { Error(Str); return 0; }
1355
1356Value *NumberExprAST::Codegen() {
Owen Anderson6f83c9c2009-07-27 20:59:43 +00001357 return ConstantFP::get(getGlobalContext(), APFloat(Val));
Chris Lattner6093bd52007-10-31 07:29:43 +00001358}
1359
1360Value *VariableExprAST::Codegen() {
1361 // Look this variable up in the function.
1362 Value *V = NamedValues[Name];
1363 return V ? V : ErrorV("Unknown variable name");
1364}
1365
1366Value *BinaryExprAST::Codegen() {
1367 Value *L = LHS-&gt;Codegen();
1368 Value *R = RHS-&gt;Codegen();
1369 if (L == 0 || R == 0) return 0;
1370
1371 switch (Op) {
Eric Christopher2214b812010-06-14 06:09:39 +00001372 case '+': return Builder.CreateFAdd(L, R, "addtmp");
1373 case '-': return Builder.CreateFSub(L, R, "subtmp");
1374 case '*': return Builder.CreateFMul(L, R, "multmp");
Chris Lattner6093bd52007-10-31 07:29:43 +00001375 case '&lt;':
Chris Lattner71155212007-11-06 01:39:12 +00001376 L = Builder.CreateFCmpULT(L, R, "cmptmp");
Chris Lattner6093bd52007-10-31 07:29:43 +00001377 // Convert bool 0/1 to double 0.0 or 1.0
Nick Lewycky422094c2009-09-13 21:38:54 +00001378 return Builder.CreateUIToFP(L, Type::getDoubleTy(getGlobalContext()),
1379 "booltmp");
Chris Lattner6093bd52007-10-31 07:29:43 +00001380 default: return ErrorV("invalid binary operator");
1381 }
1382}
1383
1384Value *CallExprAST::Codegen() {
1385 // Look up the name in the global module table.
1386 Function *CalleeF = TheModule-&gt;getFunction(Callee);
1387 if (CalleeF == 0)
1388 return ErrorV("Unknown function referenced");
1389
1390 // If argument mismatch error.
1391 if (CalleeF-&gt;arg_size() != Args.size())
1392 return ErrorV("Incorrect # arguments passed");
1393
1394 std::vector&lt;Value*&gt; ArgsV;
1395 for (unsigned i = 0, e = Args.size(); i != e; ++i) {
1396 ArgsV.push_back(Args[i]-&gt;Codegen());
1397 if (ArgsV.back() == 0) return 0;
1398 }
1399
1400 return Builder.CreateCall(CalleeF, ArgsV.begin(), ArgsV.end(), "calltmp");
1401}
1402
1403Value *IfExprAST::Codegen() {
1404 Value *CondV = Cond-&gt;Codegen();
1405 if (CondV == 0) return 0;
1406
1407 // Convert condition to a bool by comparing equal to 0.0.
1408 CondV = Builder.CreateFCmpONE(CondV,
Owen Anderson6f83c9c2009-07-27 20:59:43 +00001409 ConstantFP::get(getGlobalContext(), APFloat(0.0)),
Chris Lattner6093bd52007-10-31 07:29:43 +00001410 "ifcond");
1411
1412 Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();
1413
1414 // Create blocks for the then and else cases. Insert the 'then' block at the
1415 // end of the function.
Owen Anderson1d0be152009-08-13 21:58:54 +00001416 BasicBlock *ThenBB = BasicBlock::Create(getGlobalContext(), "then", TheFunction);
1417 BasicBlock *ElseBB = BasicBlock::Create(getGlobalContext(), "else");
1418 BasicBlock *MergeBB = BasicBlock::Create(getGlobalContext(), "ifcont");
Chris Lattner6093bd52007-10-31 07:29:43 +00001419
1420 Builder.CreateCondBr(CondV, ThenBB, ElseBB);
1421
1422 // Emit then value.
1423 Builder.SetInsertPoint(ThenBB);
1424
1425 Value *ThenV = Then-&gt;Codegen();
1426 if (ThenV == 0) return 0;
1427
1428 Builder.CreateBr(MergeBB);
1429 // Codegen of 'Then' can change the current block, update ThenBB for the PHI.
1430 ThenBB = Builder.GetInsertBlock();
1431
1432 // Emit else block.
1433 TheFunction-&gt;getBasicBlockList().push_back(ElseBB);
1434 Builder.SetInsertPoint(ElseBB);
1435
1436 Value *ElseV = Else-&gt;Codegen();
1437 if (ElseV == 0) return 0;
1438
1439 Builder.CreateBr(MergeBB);
1440 // Codegen of 'Else' can change the current block, update ElseBB for the PHI.
1441 ElseBB = Builder.GetInsertBlock();
1442
1443 // Emit merge block.
1444 TheFunction-&gt;getBasicBlockList().push_back(MergeBB);
1445 Builder.SetInsertPoint(MergeBB);
Jay Foad3ecfc862011-03-30 11:28:46 +00001446 PHINode *PN = Builder.CreatePHI(Type::getDoubleTy(getGlobalContext()), 2,
Nick Lewycky422094c2009-09-13 21:38:54 +00001447 "iftmp");
Chris Lattner6093bd52007-10-31 07:29:43 +00001448
1449 PN-&gt;addIncoming(ThenV, ThenBB);
1450 PN-&gt;addIncoming(ElseV, ElseBB);
1451 return PN;
1452}
1453
1454Value *ForExprAST::Codegen() {
Chris Lattnerf5234802007-10-31 06:47:39 +00001455 // Output this as:
1456 // ...
1457 // start = startexpr
1458 // goto loop
1459 // loop:
1460 // variable = phi [start, loopheader], [nextvariable, loopend]
1461 // ...
1462 // bodyexpr
1463 // ...
1464 // loopend:
1465 // step = stepexpr
1466 // nextvariable = variable + step
1467 // endcond = endexpr
1468 // br endcond, loop, endloop
1469 // outloop:
1470
1471 // Emit the start code first, without 'variable' in scope.
1472 Value *StartVal = Start-&gt;Codegen();
1473 if (StartVal == 0) return 0;
1474
1475 // Make the new basic block for the loop header, inserting after current
1476 // block.
1477 Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();
1478 BasicBlock *PreheaderBB = Builder.GetInsertBlock();
Owen Anderson1d0be152009-08-13 21:58:54 +00001479 BasicBlock *LoopBB = BasicBlock::Create(getGlobalContext(), "loop", TheFunction);
Chris Lattnerf5234802007-10-31 06:47:39 +00001480
1481 // Insert an explicit fall through from the current block to the LoopBB.
Chris Lattnerf5234802007-10-31 06:47:39 +00001482 Builder.CreateBr(LoopBB);
Chris Lattner6093bd52007-10-31 07:29:43 +00001483
1484 // Start insertion in LoopBB.
Chris Lattnerf5234802007-10-31 06:47:39 +00001485 Builder.SetInsertPoint(LoopBB);
1486
1487 // Start the PHI node with an entry for Start.
Jay Foad3ecfc862011-03-30 11:28:46 +00001488 PHINode *Variable = Builder.CreatePHI(Type::getDoubleTy(getGlobalContext()), 2, VarName.c_str());
Chris Lattnerf5234802007-10-31 06:47:39 +00001489 Variable-&gt;addIncoming(StartVal, PreheaderBB);
1490
1491 // Within the loop, the variable is defined equal to the PHI node. If it
1492 // shadows an existing variable, we have to restore it, so save it now.
1493 Value *OldVal = NamedValues[VarName];
1494 NamedValues[VarName] = Variable;
1495
1496 // Emit the body of the loop. This, like any other expr, can change the
1497 // current BB. Note that we ignore the value computed by the body, but don't
1498 // allow an error.
1499 if (Body-&gt;Codegen() == 0)
1500 return 0;
1501
1502 // Emit the step value.
1503 Value *StepVal;
1504 if (Step) {
1505 StepVal = Step-&gt;Codegen();
1506 if (StepVal == 0) return 0;
1507 } else {
1508 // If not specified, use 1.0.
Owen Anderson6f83c9c2009-07-27 20:59:43 +00001509 StepVal = ConstantFP::get(getGlobalContext(), APFloat(1.0));
Chris Lattnerf5234802007-10-31 06:47:39 +00001510 }
1511
Chris Lattnerf57fcf72010-09-01 20:09:20 +00001512 Value *NextVar = Builder.CreateFAdd(Variable, StepVal, "nextvar");
Chris Lattnerf5234802007-10-31 06:47:39 +00001513
Chris Lattnerf5234802007-10-31 06:47:39 +00001514 // Compute the end condition.
1515 Value *EndCond = End-&gt;Codegen();
1516 if (EndCond == 0) return EndCond;
1517
1518 // Convert condition to a bool by comparing equal to 0.0.
1519 EndCond = Builder.CreateFCmpONE(EndCond,
Owen Anderson6f83c9c2009-07-27 20:59:43 +00001520 ConstantFP::get(getGlobalContext(), APFloat(0.0)),
Chris Lattnerf5234802007-10-31 06:47:39 +00001521 "loopcond");
1522
1523 // Create the "after loop" block and insert it.
1524 BasicBlock *LoopEndBB = Builder.GetInsertBlock();
Owen Anderson1d0be152009-08-13 21:58:54 +00001525 BasicBlock *AfterBB = BasicBlock::Create(getGlobalContext(), "afterloop", TheFunction);
Chris Lattnerf5234802007-10-31 06:47:39 +00001526
1527 // Insert the conditional branch into the end of LoopEndBB.
1528 Builder.CreateCondBr(EndCond, LoopBB, AfterBB);
1529
1530 // Any new code will be inserted in AfterBB.
1531 Builder.SetInsertPoint(AfterBB);
1532
1533 // Add a new entry to the PHI node for the backedge.
1534 Variable-&gt;addIncoming(NextVar, LoopEndBB);
1535
1536 // Restore the unshadowed variable.
1537 if (OldVal)
1538 NamedValues[VarName] = OldVal;
1539 else
1540 NamedValues.erase(VarName);
1541
1542
1543 // for expr always returns 0.0.
Erick Tryzelaarfd1ec5e2009-09-22 21:14:49 +00001544 return Constant::getNullValue(Type::getDoubleTy(getGlobalContext()));
Chris Lattnerf5234802007-10-31 06:47:39 +00001545}
Chris Lattner6093bd52007-10-31 07:29:43 +00001546
1547Function *PrototypeAST::Codegen() {
1548 // Make the function type: double(double,double) etc.
Nick Lewycky422094c2009-09-13 21:38:54 +00001549 std::vector&lt;const Type*&gt; Doubles(Args.size(),
1550 Type::getDoubleTy(getGlobalContext()));
1551 FunctionType *FT = FunctionType::get(Type::getDoubleTy(getGlobalContext()),
1552 Doubles, false);
Chris Lattner6093bd52007-10-31 07:29:43 +00001553
Gabor Greifdf7d2b42008-04-19 22:25:09 +00001554 Function *F = Function::Create(FT, Function::ExternalLinkage, Name, TheModule);
Chris Lattner6093bd52007-10-31 07:29:43 +00001555
1556 // If F conflicted, there was already something named 'Name'. If it has a
1557 // body, don't allow redefinition or reextern.
1558 if (F-&gt;getName() != Name) {
1559 // Delete the one we just made and get the existing one.
1560 F-&gt;eraseFromParent();
1561 F = TheModule-&gt;getFunction(Name);
1562
1563 // If F already has a body, reject this.
1564 if (!F-&gt;empty()) {
1565 ErrorF("redefinition of function");
1566 return 0;
1567 }
1568
1569 // If F took a different number of args, reject.
1570 if (F-&gt;arg_size() != Args.size()) {
1571 ErrorF("redefinition of function with different # args");
1572 return 0;
1573 }
1574 }
1575
1576 // Set names for all arguments.
1577 unsigned Idx = 0;
1578 for (Function::arg_iterator AI = F-&gt;arg_begin(); Idx != Args.size();
1579 ++AI, ++Idx) {
1580 AI-&gt;setName(Args[Idx]);
1581
1582 // Add arguments to variable symbol table.
1583 NamedValues[Args[Idx]] = AI;
1584 }
1585
1586 return F;
1587}
1588
1589Function *FunctionAST::Codegen() {
1590 NamedValues.clear();
1591
1592 Function *TheFunction = Proto-&gt;Codegen();
1593 if (TheFunction == 0)
1594 return 0;
1595
1596 // Create a new basic block to start insertion into.
Owen Anderson1d0be152009-08-13 21:58:54 +00001597 BasicBlock *BB = BasicBlock::Create(getGlobalContext(), "entry", TheFunction);
Chris Lattner6093bd52007-10-31 07:29:43 +00001598 Builder.SetInsertPoint(BB);
1599
1600 if (Value *RetVal = Body-&gt;Codegen()) {
1601 // Finish off the function.
1602 Builder.CreateRet(RetVal);
1603
1604 // Validate the generated code, checking for consistency.
1605 verifyFunction(*TheFunction);
1606
1607 // Optimize the function.
1608 TheFPM-&gt;run(*TheFunction);
1609
1610 return TheFunction;
1611 }
1612
1613 // Error reading body, remove function.
1614 TheFunction-&gt;eraseFromParent();
1615 return 0;
1616}
1617
1618//===----------------------------------------------------------------------===//
1619// Top-Level parsing and JIT Driver
1620//===----------------------------------------------------------------------===//
1621
1622static ExecutionEngine *TheExecutionEngine;
1623
1624static void HandleDefinition() {
1625 if (FunctionAST *F = ParseDefinition()) {
1626 if (Function *LF = F-&gt;Codegen()) {
1627 fprintf(stderr, "Read function definition:");
1628 LF-&gt;dump();
1629 }
1630 } else {
1631 // Skip token for error recovery.
1632 getNextToken();
1633 }
1634}
1635
1636static void HandleExtern() {
1637 if (PrototypeAST *P = ParseExtern()) {
1638 if (Function *F = P-&gt;Codegen()) {
1639 fprintf(stderr, "Read extern: ");
1640 F-&gt;dump();
1641 }
1642 } else {
1643 // Skip token for error recovery.
1644 getNextToken();
1645 }
1646}
1647
1648static void HandleTopLevelExpression() {
Erick Tryzelaarfd1ec5e2009-09-22 21:14:49 +00001649 // Evaluate a top-level expression into an anonymous function.
Chris Lattner6093bd52007-10-31 07:29:43 +00001650 if (FunctionAST *F = ParseTopLevelExpr()) {
1651 if (Function *LF = F-&gt;Codegen()) {
1652 // JIT the function, returning a function pointer.
1653 void *FPtr = TheExecutionEngine-&gt;getPointerToFunction(LF);
1654
1655 // Cast it to the right type (takes no arguments, returns a double) so we
1656 // can call it as a native function.
Nick Lewycky422094c2009-09-13 21:38:54 +00001657 double (*FP)() = (double (*)())(intptr_t)FPtr;
Chris Lattner6093bd52007-10-31 07:29:43 +00001658 fprintf(stderr, "Evaluated to %f\n", FP());
1659 }
1660 } else {
1661 // Skip token for error recovery.
1662 getNextToken();
1663 }
1664}
1665
1666/// top ::= definition | external | expression | ';'
1667static void MainLoop() {
1668 while (1) {
1669 fprintf(stderr, "ready&gt; ");
1670 switch (CurTok) {
1671 case tok_eof: return;
Erick Tryzelaarfd1ec5e2009-09-22 21:14:49 +00001672 case ';': getNextToken(); break; // ignore top-level semicolons.
Chris Lattner6093bd52007-10-31 07:29:43 +00001673 case tok_def: HandleDefinition(); break;
1674 case tok_extern: HandleExtern(); break;
1675 default: HandleTopLevelExpression(); break;
1676 }
1677 }
1678}
Chris Lattnerf5234802007-10-31 06:47:39 +00001679
Chris Lattner6093bd52007-10-31 07:29:43 +00001680//===----------------------------------------------------------------------===//
1681// "Library" functions that can be "extern'd" from user code.
1682//===----------------------------------------------------------------------===//
Chris Lattner602c832c2007-10-31 06:30:21 +00001683
Chris Lattner6093bd52007-10-31 07:29:43 +00001684/// putchard - putchar that takes a double and returns 0.
1685extern "C"
1686double putchard(double X) {
1687 putchar((char)X);
1688 return 0;
1689}
Chris Lattner602c832c2007-10-31 06:30:21 +00001690
Chris Lattner6093bd52007-10-31 07:29:43 +00001691//===----------------------------------------------------------------------===//
1692// Main driver code.
1693//===----------------------------------------------------------------------===//
Chris Lattner602c832c2007-10-31 06:30:21 +00001694
Chris Lattner6093bd52007-10-31 07:29:43 +00001695int main() {
Erick Tryzelaarfd1ec5e2009-09-22 21:14:49 +00001696 InitializeNativeTarget();
1697 LLVMContext &amp;Context = getGlobalContext();
1698
Chris Lattner6093bd52007-10-31 07:29:43 +00001699 // Install standard binary operators.
1700 // 1 is lowest precedence.
1701 BinopPrecedence['&lt;'] = 10;
1702 BinopPrecedence['+'] = 20;
1703 BinopPrecedence['-'] = 20;
1704 BinopPrecedence['*'] = 40; // highest.
Chris Lattner602c832c2007-10-31 06:30:21 +00001705
Chris Lattner6093bd52007-10-31 07:29:43 +00001706 // Prime the first token.
1707 fprintf(stderr, "ready&gt; ");
1708 getNextToken();
Chris Lattner602c832c2007-10-31 06:30:21 +00001709
Chris Lattner6093bd52007-10-31 07:29:43 +00001710 // Make the module, which holds all the code.
Erick Tryzelaarfd1ec5e2009-09-22 21:14:49 +00001711 TheModule = new Module("my cool jit", Context);
Chris Lattner6093bd52007-10-31 07:29:43 +00001712
Jeffrey Yasskinf0356fe2010-01-27 20:34:15 +00001713 // Create the JIT. This takes ownership of the module.
Jeffrey Yasskin42fc5582010-02-11 19:15:20 +00001714 std::string ErrStr;
NAKAMURA Takumi4d6deb02011-04-09 09:51:57 +00001715 TheExecutionEngine = EngineBuilder(TheModule).setErrorStr(&amp;ErrStr).create();
Jeffrey Yasskin42fc5582010-02-11 19:15:20 +00001716 if (!TheExecutionEngine) {
1717 fprintf(stderr, "Could not create ExecutionEngine: %s\n", ErrStr.c_str());
1718 exit(1);
1719 }
Chris Lattner6093bd52007-10-31 07:29:43 +00001720
Jeffrey Yasskinf0356fe2010-01-27 20:34:15 +00001721 FunctionPassManager OurFPM(TheModule);
Reid Kleckner60130f02009-08-26 20:58:25 +00001722
1723 // Set up the optimizer pipeline. Start with registering info about how the
1724 // target lays out data structures.
1725 OurFPM.add(new TargetData(*TheExecutionEngine-&gt;getTargetData()));
Dan Gohmandfa1a792010-11-15 18:41:10 +00001726 // Provide basic AliasAnalysis support for GVN.
1727 OurFPM.add(createBasicAliasAnalysisPass());
Reid Kleckner60130f02009-08-26 20:58:25 +00001728 // Do simple "peephole" optimizations and bit-twiddling optzns.
1729 OurFPM.add(createInstructionCombiningPass());
1730 // Reassociate expressions.
1731 OurFPM.add(createReassociatePass());
1732 // Eliminate Common SubExpressions.
1733 OurFPM.add(createGVNPass());
1734 // Simplify the control flow graph (deleting unreachable blocks, etc).
1735 OurFPM.add(createCFGSimplificationPass());
1736
Nick Lewycky422094c2009-09-13 21:38:54 +00001737 OurFPM.doInitialization();
1738
Reid Kleckner60130f02009-08-26 20:58:25 +00001739 // Set the global so the code gen can use this.
1740 TheFPM = &amp;OurFPM;
1741
1742 // Run the main "interpreter loop" now.
1743 MainLoop();
1744
1745 TheFPM = 0;
1746
1747 // Print out all of the generated code.
1748 TheModule-&gt;dump();
1749
Chris Lattner6093bd52007-10-31 07:29:43 +00001750 return 0;
1751}
Chris Lattner602c832c2007-10-31 06:30:21 +00001752</pre>
1753</div>
1754
Chris Lattner729eb142008-02-10 19:11:04 +00001755<a href="LangImpl6.html">Next: Extending the language: user-defined operators</a>
Chris Lattner602c832c2007-10-31 06:30:21 +00001756</div>
1757
1758<!-- *********************************************************************** -->
1759<hr>
1760<address>
1761 <a href="http://jigsaw.w3.org/css-validator/check/referer"><img
1762 src="http://jigsaw.w3.org/css-validator/images/vcss" alt="Valid CSS!"></a>
1763 <a href="http://validator.w3.org/check/referer"><img
1764 src="http://www.w3.org/Icons/valid-html401" alt="Valid HTML 4.01!"></a>
1765
1766 <a href="mailto:sabre@nondot.org">Chris Lattner</a><br>
NAKAMURA Takumib9a33632011-04-09 02:13:37 +00001767 <a href="http://llvm.org/">The LLVM Compiler Infrastructure</a><br>
Dan Gohman523e3922010-02-03 17:27:31 +00001768 Last modified: $Date$
Chris Lattner602c832c2007-10-31 06:30:21 +00001769</address>
1770</body>
1771</html>