blob: 8405e1a917c723f3db3511feb195f48b01831554 [file] [log] [blame]
Sean Silvaee47edf2012-12-05 00:26:32 +00001==================================================
2Kaleidoscope: Extending the Language: Control Flow
3==================================================
4
5.. contents::
6 :local:
7
8Written by `Chris Lattner <mailto:sabre@nondot.org>`_
9
10Chapter 5 Introduction
11======================
12
13Welcome to Chapter 5 of the "`Implementing a language with
14LLVM <index.html>`_" tutorial. Parts 1-4 described the implementation of
15the simple Kaleidoscope language and included support for generating
16LLVM IR, followed by optimizations and a JIT compiler. Unfortunately, as
17presented, Kaleidoscope is mostly useless: it has no control flow other
18than call and return. This means that you can't have conditional
19branches in the code, significantly limiting its power. In this episode
20of "build that compiler", we'll extend Kaleidoscope to have an
21if/then/else expression plus a simple 'for' loop.
22
23If/Then/Else
24============
25
26Extending Kaleidoscope to support if/then/else is quite straightforward.
27It basically requires adding support for this "new" concept to the
28lexer, parser, AST, and LLVM code emitter. This example is nice, because
29it shows how easy it is to "grow" a language over time, incrementally
30extending it as new ideas are discovered.
31
32Before we get going on "how" we add this extension, lets talk about
33"what" we want. The basic idea is that we want to be able to write this
34sort of thing:
35
36::
37
38 def fib(x)
39 if x < 3 then
40 1
41 else
42 fib(x-1)+fib(x-2);
43
44In Kaleidoscope, every construct is an expression: there are no
45statements. As such, the if/then/else expression needs to return a value
46like any other. Since we're using a mostly functional form, we'll have
47it evaluate its conditional, then return the 'then' or 'else' value
48based on how the condition was resolved. This is very similar to the C
49"?:" expression.
50
51The semantics of the if/then/else expression is that it evaluates the
52condition to a boolean equality value: 0.0 is considered to be false and
53everything else is considered to be true. If the condition is true, the
54first subexpression is evaluated and returned, if the condition is
55false, the second subexpression is evaluated and returned. Since
56Kaleidoscope allows side-effects, this behavior is important to nail
57down.
58
59Now that we know what we "want", lets break this down into its
60constituent pieces.
61
62Lexer Extensions for If/Then/Else
63---------------------------------
64
65The lexer extensions are straightforward. First we add new enum values
66for the relevant tokens:
67
68.. code-block:: c++
69
70 // control
71 tok_if = -6, tok_then = -7, tok_else = -8,
72
73Once we have that, we recognize the new keywords in the lexer. This is
74pretty simple stuff:
75
76.. code-block:: c++
77
78 ...
79 if (IdentifierStr == "def") return tok_def;
80 if (IdentifierStr == "extern") return tok_extern;
81 if (IdentifierStr == "if") return tok_if;
82 if (IdentifierStr == "then") return tok_then;
83 if (IdentifierStr == "else") return tok_else;
84 return tok_identifier;
85
86AST Extensions for If/Then/Else
87-------------------------------
88
89To represent the new expression we add a new AST node for it:
90
91.. code-block:: c++
92
93 /// IfExprAST - Expression class for if/then/else.
94 class IfExprAST : public ExprAST {
95 ExprAST *Cond, *Then, *Else;
96 public:
97 IfExprAST(ExprAST *cond, ExprAST *then, ExprAST *_else)
98 : Cond(cond), Then(then), Else(_else) {}
99 virtual Value *Codegen();
100 };
101
102The AST node just has pointers to the various subexpressions.
103
104Parser Extensions for If/Then/Else
105----------------------------------
106
107Now that we have the relevant tokens coming from the lexer and we have
108the AST node to build, our parsing logic is relatively straightforward.
109First we define a new parsing function:
110
111.. code-block:: c++
112
113 /// ifexpr ::= 'if' expression 'then' expression 'else' expression
114 static ExprAST *ParseIfExpr() {
115 getNextToken(); // eat the if.
116
117 // condition.
118 ExprAST *Cond = ParseExpression();
119 if (!Cond) return 0;
120
121 if (CurTok != tok_then)
122 return Error("expected then");
123 getNextToken(); // eat the then
124
125 ExprAST *Then = ParseExpression();
126 if (Then == 0) return 0;
127
128 if (CurTok != tok_else)
129 return Error("expected else");
130
131 getNextToken();
132
133 ExprAST *Else = ParseExpression();
134 if (!Else) return 0;
135
136 return new IfExprAST(Cond, Then, Else);
137 }
138
139Next we hook it up as a primary expression:
140
141.. code-block:: c++
142
143 static ExprAST *ParsePrimary() {
144 switch (CurTok) {
145 default: return Error("unknown token when expecting an expression");
146 case tok_identifier: return ParseIdentifierExpr();
147 case tok_number: return ParseNumberExpr();
148 case '(': return ParseParenExpr();
149 case tok_if: return ParseIfExpr();
150 }
151 }
152
153LLVM IR for If/Then/Else
154------------------------
155
156Now that we have it parsing and building the AST, the final piece is
157adding LLVM code generation support. This is the most interesting part
158of the if/then/else example, because this is where it starts to
159introduce new concepts. All of the code above has been thoroughly
160described in previous chapters.
161
162To motivate the code we want to produce, lets take a look at a simple
163example. Consider:
164
165::
166
167 extern foo();
168 extern bar();
169 def baz(x) if x then foo() else bar();
170
171If you disable optimizations, the code you'll (soon) get from
172Kaleidoscope looks like this:
173
174.. code-block:: llvm
175
176 declare double @foo()
177
178 declare double @bar()
179
180 define double @baz(double %x) {
181 entry:
182 %ifcond = fcmp one double %x, 0.000000e+00
183 br i1 %ifcond, label %then, label %else
184
185 then: ; preds = %entry
186 %calltmp = call double @foo()
187 br label %ifcont
188
189 else: ; preds = %entry
190 %calltmp1 = call double @bar()
191 br label %ifcont
192
193 ifcont: ; preds = %else, %then
194 %iftmp = phi double [ %calltmp, %then ], [ %calltmp1, %else ]
195 ret double %iftmp
196 }
197
198To visualize the control flow graph, you can use a nifty feature of the
199LLVM '`opt <http://llvm.org/cmds/opt.html>`_' tool. If you put this LLVM
200IR into "t.ll" and run "``llvm-as < t.ll | opt -analyze -view-cfg``", `a
201window will pop up <../ProgrammersManual.html#ViewGraph>`_ and you'll
202see this graph:
203
204.. figure:: LangImpl5-cfg.png
205 :align: center
206 :alt: Example CFG
207
208 Example CFG
209
210Another way to get this is to call "``F->viewCFG()``" or
211"``F->viewCFGOnly()``" (where F is a "``Function*``") either by
212inserting actual calls into the code and recompiling or by calling these
213in the debugger. LLVM has many nice features for visualizing various
214graphs.
215
216Getting back to the generated code, it is fairly simple: the entry block
217evaluates the conditional expression ("x" in our case here) and compares
218the result to 0.0 with the "``fcmp one``" instruction ('one' is "Ordered
219and Not Equal"). Based on the result of this expression, the code jumps
220to either the "then" or "else" blocks, which contain the expressions for
221the true/false cases.
222
223Once the then/else blocks are finished executing, they both branch back
224to the 'ifcont' block to execute the code that happens after the
225if/then/else. In this case the only thing left to do is to return to the
226caller of the function. The question then becomes: how does the code
227know which expression to return?
228
229The answer to this question involves an important SSA operation: the
230`Phi
231operation <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_.
232If you're not familiar with SSA, `the wikipedia
233article <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_
234is a good introduction and there are various other introductions to it
235available on your favorite search engine. The short version is that
236"execution" of the Phi operation requires "remembering" which block
237control came from. The Phi operation takes on the value corresponding to
238the input control block. In this case, if control comes in from the
239"then" block, it gets the value of "calltmp". If control comes from the
240"else" block, it gets the value of "calltmp1".
241
242At this point, you are probably starting to think "Oh no! This means my
243simple and elegant front-end will have to start generating SSA form in
244order to use LLVM!". Fortunately, this is not the case, and we strongly
245advise *not* implementing an SSA construction algorithm in your
246front-end unless there is an amazingly good reason to do so. In
247practice, there are two sorts of values that float around in code
248written for your average imperative programming language that might need
249Phi nodes:
250
251#. Code that involves user variables: ``x = 1; x = x + 1;``
252#. Values that are implicit in the structure of your AST, such as the
253 Phi node in this case.
254
255In `Chapter 7 <LangImpl7.html>`_ of this tutorial ("mutable variables"),
256we'll talk about #1 in depth. For now, just believe me that you don't
257need SSA construction to handle this case. For #2, you have the choice
258of using the techniques that we will describe for #1, or you can insert
259Phi nodes directly, if convenient. In this case, it is really really
260easy to generate the Phi node, so we choose to do it directly.
261
262Okay, enough of the motivation and overview, lets generate code!
263
264Code Generation for If/Then/Else
265--------------------------------
266
267In order to generate code for this, we implement the ``Codegen`` method
268for ``IfExprAST``:
269
270.. code-block:: c++
271
272 Value *IfExprAST::Codegen() {
273 Value *CondV = Cond->Codegen();
274 if (CondV == 0) return 0;
275
276 // Convert condition to a bool by comparing equal to 0.0.
277 CondV = Builder.CreateFCmpONE(CondV,
278 ConstantFP::get(getGlobalContext(), APFloat(0.0)),
279 "ifcond");
280
281This code is straightforward and similar to what we saw before. We emit
282the expression for the condition, then compare that value to zero to get
283a truth value as a 1-bit (bool) value.
284
285.. code-block:: c++
286
287 Function *TheFunction = Builder.GetInsertBlock()->getParent();
288
289 // Create blocks for the then and else cases. Insert the 'then' block at the
290 // end of the function.
291 BasicBlock *ThenBB = BasicBlock::Create(getGlobalContext(), "then", TheFunction);
292 BasicBlock *ElseBB = BasicBlock::Create(getGlobalContext(), "else");
293 BasicBlock *MergeBB = BasicBlock::Create(getGlobalContext(), "ifcont");
294
295 Builder.CreateCondBr(CondV, ThenBB, ElseBB);
296
297This code creates the basic blocks that are related to the if/then/else
298statement, and correspond directly to the blocks in the example above.
299The first line gets the current Function object that is being built. It
300gets this by asking the builder for the current BasicBlock, and asking
301that block for its "parent" (the function it is currently embedded
302into).
303
304Once it has that, it creates three blocks. Note that it passes
305"TheFunction" into the constructor for the "then" block. This causes the
306constructor to automatically insert the new block into the end of the
307specified function. The other two blocks are created, but aren't yet
308inserted into the function.
309
310Once the blocks are created, we can emit the conditional branch that
311chooses between them. Note that creating new blocks does not implicitly
312affect the IRBuilder, so it is still inserting into the block that the
313condition went into. Also note that it is creating a branch to the
314"then" block and the "else" block, even though the "else" block isn't
315inserted into the function yet. This is all ok: it is the standard way
316that LLVM supports forward references.
317
318.. code-block:: c++
319
320 // Emit then value.
321 Builder.SetInsertPoint(ThenBB);
322
323 Value *ThenV = Then->Codegen();
324 if (ThenV == 0) return 0;
325
326 Builder.CreateBr(MergeBB);
327 // Codegen of 'Then' can change the current block, update ThenBB for the PHI.
328 ThenBB = Builder.GetInsertBlock();
329
330After the conditional branch is inserted, we move the builder to start
331inserting into the "then" block. Strictly speaking, this call moves the
332insertion point to be at the end of the specified block. However, since
333the "then" block is empty, it also starts out by inserting at the
334beginning of the block. :)
335
336Once the insertion point is set, we recursively codegen the "then"
337expression from the AST. To finish off the "then" block, we create an
338unconditional branch to the merge block. One interesting (and very
339important) aspect of the LLVM IR is that it `requires all basic blocks
340to be "terminated" <../LangRef.html#functionstructure>`_ with a `control
341flow instruction <../LangRef.html#terminators>`_ such as return or
342branch. This means that all control flow, *including fall throughs* must
343be made explicit in the LLVM IR. If you violate this rule, the verifier
344will emit an error.
345
346The final line here is quite subtle, but is very important. The basic
347issue is that when we create the Phi node in the merge block, we need to
348set up the block/value pairs that indicate how the Phi will work.
349Importantly, the Phi node expects to have an entry for each predecessor
350of the block in the CFG. Why then, are we getting the current block when
351we just set it to ThenBB 5 lines above? The problem is that the "Then"
352expression may actually itself change the block that the Builder is
353emitting into if, for example, it contains a nested "if/then/else"
354expression. Because calling Codegen recursively could arbitrarily change
355the notion of the current block, we are required to get an up-to-date
356value for code that will set up the Phi node.
357
358.. code-block:: c++
359
360 // Emit else block.
361 TheFunction->getBasicBlockList().push_back(ElseBB);
362 Builder.SetInsertPoint(ElseBB);
363
364 Value *ElseV = Else->Codegen();
365 if (ElseV == 0) return 0;
366
367 Builder.CreateBr(MergeBB);
368 // Codegen of 'Else' can change the current block, update ElseBB for the PHI.
369 ElseBB = Builder.GetInsertBlock();
370
371Code generation for the 'else' block is basically identical to codegen
372for the 'then' block. The only significant difference is the first line,
373which adds the 'else' block to the function. Recall previously that the
374'else' block was created, but not added to the function. Now that the
375'then' and 'else' blocks are emitted, we can finish up with the merge
376code:
377
378.. code-block:: c++
379
380 // Emit merge block.
381 TheFunction->getBasicBlockList().push_back(MergeBB);
382 Builder.SetInsertPoint(MergeBB);
383 PHINode *PN = Builder.CreatePHI(Type::getDoubleTy(getGlobalContext()), 2,
384 "iftmp");
385
386 PN->addIncoming(ThenV, ThenBB);
387 PN->addIncoming(ElseV, ElseBB);
388 return PN;
389 }
390
391The first two lines here are now familiar: the first adds the "merge"
392block to the Function object (it was previously floating, like the else
393block above). The second block changes the insertion point so that newly
394created code will go into the "merge" block. Once that is done, we need
395to create the PHI node and set up the block/value pairs for the PHI.
396
397Finally, the CodeGen function returns the phi node as the value computed
398by the if/then/else expression. In our example above, this returned
399value will feed into the code for the top-level function, which will
400create the return instruction.
401
402Overall, we now have the ability to execute conditional code in
403Kaleidoscope. With this extension, Kaleidoscope is a fairly complete
404language that can calculate a wide variety of numeric functions. Next up
405we'll add another useful expression that is familiar from non-functional
406languages...
407
408'for' Loop Expression
409=====================
410
411Now that we know how to add basic control flow constructs to the
412language, we have the tools to add more powerful things. Lets add
413something more aggressive, a 'for' expression:
414
415::
416
417 extern putchard(char)
418 def printstar(n)
419 for i = 1, i < n, 1.0 in
420 putchard(42); # ascii 42 = '*'
421
422 # print 100 '*' characters
423 printstar(100);
424
425This expression defines a new variable ("i" in this case) which iterates
426from a starting value, while the condition ("i < n" in this case) is
427true, incrementing by an optional step value ("1.0" in this case). If
428the step value is omitted, it defaults to 1.0. While the loop is true,
429it executes its body expression. Because we don't have anything better
430to return, we'll just define the loop as always returning 0.0. In the
431future when we have mutable variables, it will get more useful.
432
433As before, lets talk about the changes that we need to Kaleidoscope to
434support this.
435
436Lexer Extensions for the 'for' Loop
437-----------------------------------
438
439The lexer extensions are the same sort of thing as for if/then/else:
440
441.. code-block:: c++
442
443 ... in enum Token ...
444 // control
445 tok_if = -6, tok_then = -7, tok_else = -8,
446 tok_for = -9, tok_in = -10
447
448 ... in gettok ...
449 if (IdentifierStr == "def") return tok_def;
450 if (IdentifierStr == "extern") return tok_extern;
451 if (IdentifierStr == "if") return tok_if;
452 if (IdentifierStr == "then") return tok_then;
453 if (IdentifierStr == "else") return tok_else;
454 if (IdentifierStr == "for") return tok_for;
455 if (IdentifierStr == "in") return tok_in;
456 return tok_identifier;
457
458AST Extensions for the 'for' Loop
459---------------------------------
460
461The AST node is just as simple. It basically boils down to capturing the
462variable name and the constituent expressions in the node.
463
464.. code-block:: c++
465
466 /// ForExprAST - Expression class for for/in.
467 class ForExprAST : public ExprAST {
468 std::string VarName;
469 ExprAST *Start, *End, *Step, *Body;
470 public:
471 ForExprAST(const std::string &varname, ExprAST *start, ExprAST *end,
472 ExprAST *step, ExprAST *body)
473 : VarName(varname), Start(start), End(end), Step(step), Body(body) {}
474 virtual Value *Codegen();
475 };
476
477Parser Extensions for the 'for' Loop
478------------------------------------
479
480The parser code is also fairly standard. The only interesting thing here
481is handling of the optional step value. The parser code handles it by
482checking to see if the second comma is present. If not, it sets the step
483value to null in the AST node:
484
485.. code-block:: c++
486
487 /// forexpr ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression
488 static ExprAST *ParseForExpr() {
489 getNextToken(); // eat the for.
490
491 if (CurTok != tok_identifier)
492 return Error("expected identifier after for");
493
494 std::string IdName = IdentifierStr;
495 getNextToken(); // eat identifier.
496
497 if (CurTok != '=')
498 return Error("expected '=' after for");
499 getNextToken(); // eat '='.
500
501
502 ExprAST *Start = ParseExpression();
503 if (Start == 0) return 0;
504 if (CurTok != ',')
505 return Error("expected ',' after for start value");
506 getNextToken();
507
508 ExprAST *End = ParseExpression();
509 if (End == 0) return 0;
510
511 // The step value is optional.
512 ExprAST *Step = 0;
513 if (CurTok == ',') {
514 getNextToken();
515 Step = ParseExpression();
516 if (Step == 0) return 0;
517 }
518
519 if (CurTok != tok_in)
520 return Error("expected 'in' after for");
521 getNextToken(); // eat 'in'.
522
523 ExprAST *Body = ParseExpression();
524 if (Body == 0) return 0;
525
526 return new ForExprAST(IdName, Start, End, Step, Body);
527 }
528
529LLVM IR for the 'for' Loop
530--------------------------
531
532Now we get to the good part: the LLVM IR we want to generate for this
533thing. With the simple example above, we get this LLVM IR (note that
534this dump is generated with optimizations disabled for clarity):
535
536.. code-block:: llvm
537
538 declare double @putchard(double)
539
540 define double @printstar(double %n) {
541 entry:
542 ; initial value = 1.0 (inlined into phi)
543 br label %loop
544
545 loop: ; preds = %loop, %entry
546 %i = phi double [ 1.000000e+00, %entry ], [ %nextvar, %loop ]
547 ; body
548 %calltmp = call double @putchard(double 4.200000e+01)
549 ; increment
550 %nextvar = fadd double %i, 1.000000e+00
551
552 ; termination test
553 %cmptmp = fcmp ult double %i, %n
554 %booltmp = uitofp i1 %cmptmp to double
555 %loopcond = fcmp one double %booltmp, 0.000000e+00
556 br i1 %loopcond, label %loop, label %afterloop
557
558 afterloop: ; preds = %loop
559 ; loop always returns 0.0
560 ret double 0.000000e+00
561 }
562
563This loop contains all the same constructs we saw before: a phi node,
564several expressions, and some basic blocks. Lets see how this fits
565together.
566
567Code Generation for the 'for' Loop
568----------------------------------
569
570The first part of Codegen is very simple: we just output the start
571expression for the loop value:
572
573.. code-block:: c++
574
575 Value *ForExprAST::Codegen() {
576 // Emit the start code first, without 'variable' in scope.
577 Value *StartVal = Start->Codegen();
578 if (StartVal == 0) return 0;
579
580With this out of the way, the next step is to set up the LLVM basic
581block for the start of the loop body. In the case above, the whole loop
582body is one block, but remember that the body code itself could consist
583of multiple blocks (e.g. if it contains an if/then/else or a for/in
584expression).
585
586.. code-block:: c++
587
588 // Make the new basic block for the loop header, inserting after current
589 // block.
590 Function *TheFunction = Builder.GetInsertBlock()->getParent();
591 BasicBlock *PreheaderBB = Builder.GetInsertBlock();
592 BasicBlock *LoopBB = BasicBlock::Create(getGlobalContext(), "loop", TheFunction);
593
594 // Insert an explicit fall through from the current block to the LoopBB.
595 Builder.CreateBr(LoopBB);
596
597This code is similar to what we saw for if/then/else. Because we will
598need it to create the Phi node, we remember the block that falls through
599into the loop. Once we have that, we create the actual block that starts
600the loop and create an unconditional branch for the fall-through between
601the two blocks.
602
603.. code-block:: c++
604
605 // Start insertion in LoopBB.
606 Builder.SetInsertPoint(LoopBB);
607
608 // Start the PHI node with an entry for Start.
609 PHINode *Variable = Builder.CreatePHI(Type::getDoubleTy(getGlobalContext()), 2, VarName.c_str());
610 Variable->addIncoming(StartVal, PreheaderBB);
611
612Now that the "preheader" for the loop is set up, we switch to emitting
613code for the loop body. To begin with, we move the insertion point and
614create the PHI node for the loop induction variable. Since we already
615know the incoming value for the starting value, we add it to the Phi
616node. Note that the Phi will eventually get a second value for the
617backedge, but we can't set it up yet (because it doesn't exist!).
618
619.. code-block:: c++
620
621 // Within the loop, the variable is defined equal to the PHI node. If it
622 // shadows an existing variable, we have to restore it, so save it now.
623 Value *OldVal = NamedValues[VarName];
624 NamedValues[VarName] = Variable;
625
626 // Emit the body of the loop. This, like any other expr, can change the
627 // current BB. Note that we ignore the value computed by the body, but don't
628 // allow an error.
629 if (Body->Codegen() == 0)
630 return 0;
631
632Now the code starts to get more interesting. Our 'for' loop introduces a
633new variable to the symbol table. This means that our symbol table can
634now contain either function arguments or loop variables. To handle this,
635before we codegen the body of the loop, we add the loop variable as the
636current value for its name. Note that it is possible that there is a
637variable of the same name in the outer scope. It would be easy to make
638this an error (emit an error and return null if there is already an
639entry for VarName) but we choose to allow shadowing of variables. In
640order to handle this correctly, we remember the Value that we are
641potentially shadowing in ``OldVal`` (which will be null if there is no
642shadowed variable).
643
644Once the loop variable is set into the symbol table, the code
645recursively codegen's the body. This allows the body to use the loop
646variable: any references to it will naturally find it in the symbol
647table.
648
649.. code-block:: c++
650
651 // Emit the step value.
652 Value *StepVal;
653 if (Step) {
654 StepVal = Step->Codegen();
655 if (StepVal == 0) return 0;
656 } else {
657 // If not specified, use 1.0.
658 StepVal = ConstantFP::get(getGlobalContext(), APFloat(1.0));
659 }
660
661 Value *NextVar = Builder.CreateFAdd(Variable, StepVal, "nextvar");
662
663Now that the body is emitted, we compute the next value of the iteration
664variable by adding the step value, or 1.0 if it isn't present.
665'``NextVar``' will be the value of the loop variable on the next
666iteration of the loop.
667
668.. code-block:: c++
669
670 // Compute the end condition.
671 Value *EndCond = End->Codegen();
672 if (EndCond == 0) return EndCond;
673
674 // Convert condition to a bool by comparing equal to 0.0.
675 EndCond = Builder.CreateFCmpONE(EndCond,
676 ConstantFP::get(getGlobalContext(), APFloat(0.0)),
677 "loopcond");
678
679Finally, we evaluate the exit value of the loop, to determine whether
680the loop should exit. This mirrors the condition evaluation for the
681if/then/else statement.
682
683.. code-block:: c++
684
685 // Create the "after loop" block and insert it.
686 BasicBlock *LoopEndBB = Builder.GetInsertBlock();
687 BasicBlock *AfterBB = BasicBlock::Create(getGlobalContext(), "afterloop", TheFunction);
688
689 // Insert the conditional branch into the end of LoopEndBB.
690 Builder.CreateCondBr(EndCond, LoopBB, AfterBB);
691
692 // Any new code will be inserted in AfterBB.
693 Builder.SetInsertPoint(AfterBB);
694
695With the code for the body of the loop complete, we just need to finish
696up the control flow for it. This code remembers the end block (for the
697phi node), then creates the block for the loop exit ("afterloop"). Based
698on the value of the exit condition, it creates a conditional branch that
699chooses between executing the loop again and exiting the loop. Any
700future code is emitted in the "afterloop" block, so it sets the
701insertion position to it.
702
703.. code-block:: c++
704
705 // Add a new entry to the PHI node for the backedge.
706 Variable->addIncoming(NextVar, LoopEndBB);
707
708 // Restore the unshadowed variable.
709 if (OldVal)
710 NamedValues[VarName] = OldVal;
711 else
712 NamedValues.erase(VarName);
713
714 // for expr always returns 0.0.
715 return Constant::getNullValue(Type::getDoubleTy(getGlobalContext()));
716 }
717
718The final code handles various cleanups: now that we have the "NextVar"
719value, we can add the incoming value to the loop PHI node. After that,
720we remove the loop variable from the symbol table, so that it isn't in
721scope after the for loop. Finally, code generation of the for loop
722always returns 0.0, so that is what we return from
723``ForExprAST::Codegen``.
724
725With this, we conclude the "adding control flow to Kaleidoscope" chapter
726of the tutorial. In this chapter we added two control flow constructs,
727and used them to motivate a couple of aspects of the LLVM IR that are
728important for front-end implementors to know. In the next chapter of our
729saga, we will get a bit crazier and add `user-defined
730operators <LangImpl6.html>`_ to our poor innocent language.
731
732Full Code Listing
733=================
734
735Here is the complete code listing for our running example, enhanced with
736the if/then/else and for expressions.. To build this example, use:
737
738.. code-block:: bash
739
740 # Compile
741 clang++ -g toy.cpp `llvm-config --cppflags --ldflags --libs core jit native` -O3 -o toy
742 # Run
743 ./toy
744
745Here is the code:
746
747.. code-block:: c++
748
749 #include "llvm/DerivedTypes.h"
750 #include "llvm/ExecutionEngine/ExecutionEngine.h"
751 #include "llvm/ExecutionEngine/JIT.h"
752 #include "llvm/IRBuilder.h"
753 #include "llvm/LLVMContext.h"
754 #include "llvm/Module.h"
755 #include "llvm/PassManager.h"
756 #include "llvm/Analysis/Verifier.h"
757 #include "llvm/Analysis/Passes.h"
758 #include "llvm/DataLayout.h"
759 #include "llvm/Transforms/Scalar.h"
760 #include "llvm/Support/TargetSelect.h"
761 #include <cstdio>
762 #include <string>
763 #include <map>
764 #include <vector>
765 using namespace llvm;
766
767 //===----------------------------------------------------------------------===//
768 // Lexer
769 //===----------------------------------------------------------------------===//
770
771 // The lexer returns tokens [0-255] if it is an unknown character, otherwise one
772 // of these for known things.
773 enum Token {
774 tok_eof = -1,
775
776 // commands
777 tok_def = -2, tok_extern = -3,
778
779 // primary
780 tok_identifier = -4, tok_number = -5,
781
782 // control
783 tok_if = -6, tok_then = -7, tok_else = -8,
784 tok_for = -9, tok_in = -10
785 };
786
787 static std::string IdentifierStr; // Filled in if tok_identifier
788 static double NumVal; // Filled in if tok_number
789
790 /// gettok - Return the next token from standard input.
791 static int gettok() {
792 static int LastChar = ' ';
793
794 // Skip any whitespace.
795 while (isspace(LastChar))
796 LastChar = getchar();
797
798 if (isalpha(LastChar)) { // identifier: [a-zA-Z][a-zA-Z0-9]*
799 IdentifierStr = LastChar;
800 while (isalnum((LastChar = getchar())))
801 IdentifierStr += LastChar;
802
803 if (IdentifierStr == "def") return tok_def;
804 if (IdentifierStr == "extern") return tok_extern;
805 if (IdentifierStr == "if") return tok_if;
806 if (IdentifierStr == "then") return tok_then;
807 if (IdentifierStr == "else") return tok_else;
808 if (IdentifierStr == "for") return tok_for;
809 if (IdentifierStr == "in") return tok_in;
810 return tok_identifier;
811 }
812
813 if (isdigit(LastChar) || LastChar == '.') { // Number: [0-9.]+
814 std::string NumStr;
815 do {
816 NumStr += LastChar;
817 LastChar = getchar();
818 } while (isdigit(LastChar) || LastChar == '.');
819
820 NumVal = strtod(NumStr.c_str(), 0);
821 return tok_number;
822 }
823
824 if (LastChar == '#') {
825 // Comment until end of line.
826 do LastChar = getchar();
827 while (LastChar != EOF && LastChar != '\n' && LastChar != '\r');
828
829 if (LastChar != EOF)
830 return gettok();
831 }
832
833 // Check for end of file. Don't eat the EOF.
834 if (LastChar == EOF)
835 return tok_eof;
836
837 // Otherwise, just return the character as its ascii value.
838 int ThisChar = LastChar;
839 LastChar = getchar();
840 return ThisChar;
841 }
842
843 //===----------------------------------------------------------------------===//
844 // Abstract Syntax Tree (aka Parse Tree)
845 //===----------------------------------------------------------------------===//
846
847 /// ExprAST - Base class for all expression nodes.
848 class ExprAST {
849 public:
850 virtual ~ExprAST() {}
851 virtual Value *Codegen() = 0;
852 };
853
854 /// NumberExprAST - Expression class for numeric literals like "1.0".
855 class NumberExprAST : public ExprAST {
856 double Val;
857 public:
858 NumberExprAST(double val) : Val(val) {}
859 virtual Value *Codegen();
860 };
861
862 /// VariableExprAST - Expression class for referencing a variable, like "a".
863 class VariableExprAST : public ExprAST {
864 std::string Name;
865 public:
866 VariableExprAST(const std::string &name) : Name(name) {}
867 virtual Value *Codegen();
868 };
869
870 /// BinaryExprAST - Expression class for a binary operator.
871 class BinaryExprAST : public ExprAST {
872 char Op;
873 ExprAST *LHS, *RHS;
874 public:
875 BinaryExprAST(char op, ExprAST *lhs, ExprAST *rhs)
876 : Op(op), LHS(lhs), RHS(rhs) {}
877 virtual Value *Codegen();
878 };
879
880 /// CallExprAST - Expression class for function calls.
881 class CallExprAST : public ExprAST {
882 std::string Callee;
883 std::vector<ExprAST*> Args;
884 public:
885 CallExprAST(const std::string &callee, std::vector<ExprAST*> &args)
886 : Callee(callee), Args(args) {}
887 virtual Value *Codegen();
888 };
889
890 /// IfExprAST - Expression class for if/then/else.
891 class IfExprAST : public ExprAST {
892 ExprAST *Cond, *Then, *Else;
893 public:
894 IfExprAST(ExprAST *cond, ExprAST *then, ExprAST *_else)
895 : Cond(cond), Then(then), Else(_else) {}
896 virtual Value *Codegen();
897 };
898
899 /// ForExprAST - Expression class for for/in.
900 class ForExprAST : public ExprAST {
901 std::string VarName;
902 ExprAST *Start, *End, *Step, *Body;
903 public:
904 ForExprAST(const std::string &varname, ExprAST *start, ExprAST *end,
905 ExprAST *step, ExprAST *body)
906 : VarName(varname), Start(start), End(end), Step(step), Body(body) {}
907 virtual Value *Codegen();
908 };
909
910 /// PrototypeAST - This class represents the "prototype" for a function,
911 /// which captures its name, and its argument names (thus implicitly the number
912 /// of arguments the function takes).
913 class PrototypeAST {
914 std::string Name;
915 std::vector<std::string> Args;
916 public:
917 PrototypeAST(const std::string &name, const std::vector<std::string> &args)
918 : Name(name), Args(args) {}
919
920 Function *Codegen();
921 };
922
923 /// FunctionAST - This class represents a function definition itself.
924 class FunctionAST {
925 PrototypeAST *Proto;
926 ExprAST *Body;
927 public:
928 FunctionAST(PrototypeAST *proto, ExprAST *body)
929 : Proto(proto), Body(body) {}
930
931 Function *Codegen();
932 };
933
934 //===----------------------------------------------------------------------===//
935 // Parser
936 //===----------------------------------------------------------------------===//
937
938 /// CurTok/getNextToken - Provide a simple token buffer. CurTok is the current
939 /// token the parser is looking at. getNextToken reads another token from the
940 /// lexer and updates CurTok with its results.
941 static int CurTok;
942 static int getNextToken() {
943 return CurTok = gettok();
944 }
945
946 /// BinopPrecedence - This holds the precedence for each binary operator that is
947 /// defined.
948 static std::map<char, int> BinopPrecedence;
949
950 /// GetTokPrecedence - Get the precedence of the pending binary operator token.
951 static int GetTokPrecedence() {
952 if (!isascii(CurTok))
953 return -1;
954
955 // Make sure it's a declared binop.
956 int TokPrec = BinopPrecedence[CurTok];
957 if (TokPrec <= 0) return -1;
958 return TokPrec;
959 }
960
961 /// Error* - These are little helper functions for error handling.
962 ExprAST *Error(const char *Str) { fprintf(stderr, "Error: %s\n", Str);return 0;}
963 PrototypeAST *ErrorP(const char *Str) { Error(Str); return 0; }
964 FunctionAST *ErrorF(const char *Str) { Error(Str); return 0; }
965
966 static ExprAST *ParseExpression();
967
968 /// identifierexpr
969 /// ::= identifier
970 /// ::= identifier '(' expression* ')'
971 static ExprAST *ParseIdentifierExpr() {
972 std::string IdName = IdentifierStr;
973
974 getNextToken(); // eat identifier.
975
976 if (CurTok != '(') // Simple variable ref.
977 return new VariableExprAST(IdName);
978
979 // Call.
980 getNextToken(); // eat (
981 std::vector<ExprAST*> Args;
982 if (CurTok != ')') {
983 while (1) {
984 ExprAST *Arg = ParseExpression();
985 if (!Arg) return 0;
986 Args.push_back(Arg);
987
988 if (CurTok == ')') break;
989
990 if (CurTok != ',')
991 return Error("Expected ')' or ',' in argument list");
992 getNextToken();
993 }
994 }
995
996 // Eat the ')'.
997 getNextToken();
998
999 return new CallExprAST(IdName, Args);
1000 }
1001
1002 /// numberexpr ::= number
1003 static ExprAST *ParseNumberExpr() {
1004 ExprAST *Result = new NumberExprAST(NumVal);
1005 getNextToken(); // consume the number
1006 return Result;
1007 }
1008
1009 /// parenexpr ::= '(' expression ')'
1010 static ExprAST *ParseParenExpr() {
1011 getNextToken(); // eat (.
1012 ExprAST *V = ParseExpression();
1013 if (!V) return 0;
1014
1015 if (CurTok != ')')
1016 return Error("expected ')'");
1017 getNextToken(); // eat ).
1018 return V;
1019 }
1020
1021 /// ifexpr ::= 'if' expression 'then' expression 'else' expression
1022 static ExprAST *ParseIfExpr() {
1023 getNextToken(); // eat the if.
1024
1025 // condition.
1026 ExprAST *Cond = ParseExpression();
1027 if (!Cond) return 0;
1028
1029 if (CurTok != tok_then)
1030 return Error("expected then");
1031 getNextToken(); // eat the then
1032
1033 ExprAST *Then = ParseExpression();
1034 if (Then == 0) return 0;
1035
1036 if (CurTok != tok_else)
1037 return Error("expected else");
1038
1039 getNextToken();
1040
1041 ExprAST *Else = ParseExpression();
1042 if (!Else) return 0;
1043
1044 return new IfExprAST(Cond, Then, Else);
1045 }
1046
1047 /// forexpr ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression
1048 static ExprAST *ParseForExpr() {
1049 getNextToken(); // eat the for.
1050
1051 if (CurTok != tok_identifier)
1052 return Error("expected identifier after for");
1053
1054 std::string IdName = IdentifierStr;
1055 getNextToken(); // eat identifier.
1056
1057 if (CurTok != '=')
1058 return Error("expected '=' after for");
1059 getNextToken(); // eat '='.
1060
1061
1062 ExprAST *Start = ParseExpression();
1063 if (Start == 0) return 0;
1064 if (CurTok != ',')
1065 return Error("expected ',' after for start value");
1066 getNextToken();
1067
1068 ExprAST *End = ParseExpression();
1069 if (End == 0) return 0;
1070
1071 // The step value is optional.
1072 ExprAST *Step = 0;
1073 if (CurTok == ',') {
1074 getNextToken();
1075 Step = ParseExpression();
1076 if (Step == 0) return 0;
1077 }
1078
1079 if (CurTok != tok_in)
1080 return Error("expected 'in' after for");
1081 getNextToken(); // eat 'in'.
1082
1083 ExprAST *Body = ParseExpression();
1084 if (Body == 0) return 0;
1085
1086 return new ForExprAST(IdName, Start, End, Step, Body);
1087 }
1088
1089 /// primary
1090 /// ::= identifierexpr
1091 /// ::= numberexpr
1092 /// ::= parenexpr
1093 /// ::= ifexpr
1094 /// ::= forexpr
1095 static ExprAST *ParsePrimary() {
1096 switch (CurTok) {
1097 default: return Error("unknown token when expecting an expression");
1098 case tok_identifier: return ParseIdentifierExpr();
1099 case tok_number: return ParseNumberExpr();
1100 case '(': return ParseParenExpr();
1101 case tok_if: return ParseIfExpr();
1102 case tok_for: return ParseForExpr();
1103 }
1104 }
1105
1106 /// binoprhs
1107 /// ::= ('+' primary)*
1108 static ExprAST *ParseBinOpRHS(int ExprPrec, ExprAST *LHS) {
1109 // If this is a binop, find its precedence.
1110 while (1) {
1111 int TokPrec = GetTokPrecedence();
1112
1113 // If this is a binop that binds at least as tightly as the current binop,
1114 // consume it, otherwise we are done.
1115 if (TokPrec < ExprPrec)
1116 return LHS;
1117
1118 // Okay, we know this is a binop.
1119 int BinOp = CurTok;
1120 getNextToken(); // eat binop
1121
1122 // Parse the primary expression after the binary operator.
1123 ExprAST *RHS = ParsePrimary();
1124 if (!RHS) return 0;
1125
1126 // If BinOp binds less tightly with RHS than the operator after RHS, let
1127 // the pending operator take RHS as its LHS.
1128 int NextPrec = GetTokPrecedence();
1129 if (TokPrec < NextPrec) {
1130 RHS = ParseBinOpRHS(TokPrec+1, RHS);
1131 if (RHS == 0) return 0;
1132 }
1133
1134 // Merge LHS/RHS.
1135 LHS = new BinaryExprAST(BinOp, LHS, RHS);
1136 }
1137 }
1138
1139 /// expression
1140 /// ::= primary binoprhs
1141 ///
1142 static ExprAST *ParseExpression() {
1143 ExprAST *LHS = ParsePrimary();
1144 if (!LHS) return 0;
1145
1146 return ParseBinOpRHS(0, LHS);
1147 }
1148
1149 /// prototype
1150 /// ::= id '(' id* ')'
1151 static PrototypeAST *ParsePrototype() {
1152 if (CurTok != tok_identifier)
1153 return ErrorP("Expected function name in prototype");
1154
1155 std::string FnName = IdentifierStr;
1156 getNextToken();
1157
1158 if (CurTok != '(')
1159 return ErrorP("Expected '(' in prototype");
1160
1161 std::vector<std::string> ArgNames;
1162 while (getNextToken() == tok_identifier)
1163 ArgNames.push_back(IdentifierStr);
1164 if (CurTok != ')')
1165 return ErrorP("Expected ')' in prototype");
1166
1167 // success.
1168 getNextToken(); // eat ')'.
1169
1170 return new PrototypeAST(FnName, ArgNames);
1171 }
1172
1173 /// definition ::= 'def' prototype expression
1174 static FunctionAST *ParseDefinition() {
1175 getNextToken(); // eat def.
1176 PrototypeAST *Proto = ParsePrototype();
1177 if (Proto == 0) return 0;
1178
1179 if (ExprAST *E = ParseExpression())
1180 return new FunctionAST(Proto, E);
1181 return 0;
1182 }
1183
1184 /// toplevelexpr ::= expression
1185 static FunctionAST *ParseTopLevelExpr() {
1186 if (ExprAST *E = ParseExpression()) {
1187 // Make an anonymous proto.
1188 PrototypeAST *Proto = new PrototypeAST("", std::vector<std::string>());
1189 return new FunctionAST(Proto, E);
1190 }
1191 return 0;
1192 }
1193
1194 /// external ::= 'extern' prototype
1195 static PrototypeAST *ParseExtern() {
1196 getNextToken(); // eat extern.
1197 return ParsePrototype();
1198 }
1199
1200 //===----------------------------------------------------------------------===//
1201 // Code Generation
1202 //===----------------------------------------------------------------------===//
1203
1204 static Module *TheModule;
1205 static IRBuilder<> Builder(getGlobalContext());
1206 static std::map<std::string, Value*> NamedValues;
1207 static FunctionPassManager *TheFPM;
1208
1209 Value *ErrorV(const char *Str) { Error(Str); return 0; }
1210
1211 Value *NumberExprAST::Codegen() {
1212 return ConstantFP::get(getGlobalContext(), APFloat(Val));
1213 }
1214
1215 Value *VariableExprAST::Codegen() {
1216 // Look this variable up in the function.
1217 Value *V = NamedValues[Name];
1218 return V ? V : ErrorV("Unknown variable name");
1219 }
1220
1221 Value *BinaryExprAST::Codegen() {
1222 Value *L = LHS->Codegen();
1223 Value *R = RHS->Codegen();
1224 if (L == 0 || R == 0) return 0;
1225
1226 switch (Op) {
1227 case '+': return Builder.CreateFAdd(L, R, "addtmp");
1228 case '-': return Builder.CreateFSub(L, R, "subtmp");
1229 case '*': return Builder.CreateFMul(L, R, "multmp");
1230 case '<':
1231 L = Builder.CreateFCmpULT(L, R, "cmptmp");
1232 // Convert bool 0/1 to double 0.0 or 1.0
1233 return Builder.CreateUIToFP(L, Type::getDoubleTy(getGlobalContext()),
1234 "booltmp");
1235 default: return ErrorV("invalid binary operator");
1236 }
1237 }
1238
1239 Value *CallExprAST::Codegen() {
1240 // Look up the name in the global module table.
1241 Function *CalleeF = TheModule->getFunction(Callee);
1242 if (CalleeF == 0)
1243 return ErrorV("Unknown function referenced");
1244
1245 // If argument mismatch error.
1246 if (CalleeF->arg_size() != Args.size())
1247 return ErrorV("Incorrect # arguments passed");
1248
1249 std::vector<Value*> ArgsV;
1250 for (unsigned i = 0, e = Args.size(); i != e; ++i) {
1251 ArgsV.push_back(Args[i]->Codegen());
1252 if (ArgsV.back() == 0) return 0;
1253 }
1254
1255 return Builder.CreateCall(CalleeF, ArgsV, "calltmp");
1256 }
1257
1258 Value *IfExprAST::Codegen() {
1259 Value *CondV = Cond->Codegen();
1260 if (CondV == 0) return 0;
1261
1262 // Convert condition to a bool by comparing equal to 0.0.
1263 CondV = Builder.CreateFCmpONE(CondV,
1264 ConstantFP::get(getGlobalContext(), APFloat(0.0)),
1265 "ifcond");
1266
1267 Function *TheFunction = Builder.GetInsertBlock()->getParent();
1268
1269 // Create blocks for the then and else cases. Insert the 'then' block at the
1270 // end of the function.
1271 BasicBlock *ThenBB = BasicBlock::Create(getGlobalContext(), "then", TheFunction);
1272 BasicBlock *ElseBB = BasicBlock::Create(getGlobalContext(), "else");
1273 BasicBlock *MergeBB = BasicBlock::Create(getGlobalContext(), "ifcont");
1274
1275 Builder.CreateCondBr(CondV, ThenBB, ElseBB);
1276
1277 // Emit then value.
1278 Builder.SetInsertPoint(ThenBB);
1279
1280 Value *ThenV = Then->Codegen();
1281 if (ThenV == 0) return 0;
1282
1283 Builder.CreateBr(MergeBB);
1284 // Codegen of 'Then' can change the current block, update ThenBB for the PHI.
1285 ThenBB = Builder.GetInsertBlock();
1286
1287 // Emit else block.
1288 TheFunction->getBasicBlockList().push_back(ElseBB);
1289 Builder.SetInsertPoint(ElseBB);
1290
1291 Value *ElseV = Else->Codegen();
1292 if (ElseV == 0) return 0;
1293
1294 Builder.CreateBr(MergeBB);
1295 // Codegen of 'Else' can change the current block, update ElseBB for the PHI.
1296 ElseBB = Builder.GetInsertBlock();
1297
1298 // Emit merge block.
1299 TheFunction->getBasicBlockList().push_back(MergeBB);
1300 Builder.SetInsertPoint(MergeBB);
1301 PHINode *PN = Builder.CreatePHI(Type::getDoubleTy(getGlobalContext()), 2,
1302 "iftmp");
1303
1304 PN->addIncoming(ThenV, ThenBB);
1305 PN->addIncoming(ElseV, ElseBB);
1306 return PN;
1307 }
1308
1309 Value *ForExprAST::Codegen() {
1310 // Output this as:
1311 // ...
1312 // start = startexpr
1313 // goto loop
1314 // loop:
1315 // variable = phi [start, loopheader], [nextvariable, loopend]
1316 // ...
1317 // bodyexpr
1318 // ...
1319 // loopend:
1320 // step = stepexpr
1321 // nextvariable = variable + step
1322 // endcond = endexpr
1323 // br endcond, loop, endloop
1324 // outloop:
1325
1326 // Emit the start code first, without 'variable' in scope.
1327 Value *StartVal = Start->Codegen();
1328 if (StartVal == 0) return 0;
1329
1330 // Make the new basic block for the loop header, inserting after current
1331 // block.
1332 Function *TheFunction = Builder.GetInsertBlock()->getParent();
1333 BasicBlock *PreheaderBB = Builder.GetInsertBlock();
1334 BasicBlock *LoopBB = BasicBlock::Create(getGlobalContext(), "loop", TheFunction);
1335
1336 // Insert an explicit fall through from the current block to the LoopBB.
1337 Builder.CreateBr(LoopBB);
1338
1339 // Start insertion in LoopBB.
1340 Builder.SetInsertPoint(LoopBB);
1341
1342 // Start the PHI node with an entry for Start.
1343 PHINode *Variable = Builder.CreatePHI(Type::getDoubleTy(getGlobalContext()), 2, VarName.c_str());
1344 Variable->addIncoming(StartVal, PreheaderBB);
1345
1346 // Within the loop, the variable is defined equal to the PHI node. If it
1347 // shadows an existing variable, we have to restore it, so save it now.
1348 Value *OldVal = NamedValues[VarName];
1349 NamedValues[VarName] = Variable;
1350
1351 // Emit the body of the loop. This, like any other expr, can change the
1352 // current BB. Note that we ignore the value computed by the body, but don't
1353 // allow an error.
1354 if (Body->Codegen() == 0)
1355 return 0;
1356
1357 // Emit the step value.
1358 Value *StepVal;
1359 if (Step) {
1360 StepVal = Step->Codegen();
1361 if (StepVal == 0) return 0;
1362 } else {
1363 // If not specified, use 1.0.
1364 StepVal = ConstantFP::get(getGlobalContext(), APFloat(1.0));
1365 }
1366
1367 Value *NextVar = Builder.CreateFAdd(Variable, StepVal, "nextvar");
1368
1369 // Compute the end condition.
1370 Value *EndCond = End->Codegen();
1371 if (EndCond == 0) return EndCond;
1372
1373 // Convert condition to a bool by comparing equal to 0.0.
1374 EndCond = Builder.CreateFCmpONE(EndCond,
1375 ConstantFP::get(getGlobalContext(), APFloat(0.0)),
1376 "loopcond");
1377
1378 // Create the "after loop" block and insert it.
1379 BasicBlock *LoopEndBB = Builder.GetInsertBlock();
1380 BasicBlock *AfterBB = BasicBlock::Create(getGlobalContext(), "afterloop", TheFunction);
1381
1382 // Insert the conditional branch into the end of LoopEndBB.
1383 Builder.CreateCondBr(EndCond, LoopBB, AfterBB);
1384
1385 // Any new code will be inserted in AfterBB.
1386 Builder.SetInsertPoint(AfterBB);
1387
1388 // Add a new entry to the PHI node for the backedge.
1389 Variable->addIncoming(NextVar, LoopEndBB);
1390
1391 // Restore the unshadowed variable.
1392 if (OldVal)
1393 NamedValues[VarName] = OldVal;
1394 else
1395 NamedValues.erase(VarName);
1396
1397
1398 // for expr always returns 0.0.
1399 return Constant::getNullValue(Type::getDoubleTy(getGlobalContext()));
1400 }
1401
1402 Function *PrototypeAST::Codegen() {
1403 // Make the function type: double(double,double) etc.
1404 std::vector<Type*> Doubles(Args.size(),
1405 Type::getDoubleTy(getGlobalContext()));
1406 FunctionType *FT = FunctionType::get(Type::getDoubleTy(getGlobalContext()),
1407 Doubles, false);
1408
1409 Function *F = Function::Create(FT, Function::ExternalLinkage, Name, TheModule);
1410
1411 // If F conflicted, there was already something named 'Name'. If it has a
1412 // body, don't allow redefinition or reextern.
1413 if (F->getName() != Name) {
1414 // Delete the one we just made and get the existing one.
1415 F->eraseFromParent();
1416 F = TheModule->getFunction(Name);
1417
1418 // If F already has a body, reject this.
1419 if (!F->empty()) {
1420 ErrorF("redefinition of function");
1421 return 0;
1422 }
1423
1424 // If F took a different number of args, reject.
1425 if (F->arg_size() != Args.size()) {
1426 ErrorF("redefinition of function with different # args");
1427 return 0;
1428 }
1429 }
1430
1431 // Set names for all arguments.
1432 unsigned Idx = 0;
1433 for (Function::arg_iterator AI = F->arg_begin(); Idx != Args.size();
1434 ++AI, ++Idx) {
1435 AI->setName(Args[Idx]);
1436
1437 // Add arguments to variable symbol table.
1438 NamedValues[Args[Idx]] = AI;
1439 }
1440
1441 return F;
1442 }
1443
1444 Function *FunctionAST::Codegen() {
1445 NamedValues.clear();
1446
1447 Function *TheFunction = Proto->Codegen();
1448 if (TheFunction == 0)
1449 return 0;
1450
1451 // Create a new basic block to start insertion into.
1452 BasicBlock *BB = BasicBlock::Create(getGlobalContext(), "entry", TheFunction);
1453 Builder.SetInsertPoint(BB);
1454
1455 if (Value *RetVal = Body->Codegen()) {
1456 // Finish off the function.
1457 Builder.CreateRet(RetVal);
1458
1459 // Validate the generated code, checking for consistency.
1460 verifyFunction(*TheFunction);
1461
1462 // Optimize the function.
1463 TheFPM->run(*TheFunction);
1464
1465 return TheFunction;
1466 }
1467
1468 // Error reading body, remove function.
1469 TheFunction->eraseFromParent();
1470 return 0;
1471 }
1472
1473 //===----------------------------------------------------------------------===//
1474 // Top-Level parsing and JIT Driver
1475 //===----------------------------------------------------------------------===//
1476
1477 static ExecutionEngine *TheExecutionEngine;
1478
1479 static void HandleDefinition() {
1480 if (FunctionAST *F = ParseDefinition()) {
1481 if (Function *LF = F->Codegen()) {
1482 fprintf(stderr, "Read function definition:");
1483 LF->dump();
1484 }
1485 } else {
1486 // Skip token for error recovery.
1487 getNextToken();
1488 }
1489 }
1490
1491 static void HandleExtern() {
1492 if (PrototypeAST *P = ParseExtern()) {
1493 if (Function *F = P->Codegen()) {
1494 fprintf(stderr, "Read extern: ");
1495 F->dump();
1496 }
1497 } else {
1498 // Skip token for error recovery.
1499 getNextToken();
1500 }
1501 }
1502
1503 static void HandleTopLevelExpression() {
1504 // Evaluate a top-level expression into an anonymous function.
1505 if (FunctionAST *F = ParseTopLevelExpr()) {
1506 if (Function *LF = F->Codegen()) {
1507 // JIT the function, returning a function pointer.
1508 void *FPtr = TheExecutionEngine->getPointerToFunction(LF);
1509
1510 // Cast it to the right type (takes no arguments, returns a double) so we
1511 // can call it as a native function.
1512 double (*FP)() = (double (*)())(intptr_t)FPtr;
1513 fprintf(stderr, "Evaluated to %f\n", FP());
1514 }
1515 } else {
1516 // Skip token for error recovery.
1517 getNextToken();
1518 }
1519 }
1520
1521 /// top ::= definition | external | expression | ';'
1522 static void MainLoop() {
1523 while (1) {
1524 fprintf(stderr, "ready> ");
1525 switch (CurTok) {
1526 case tok_eof: return;
1527 case ';': getNextToken(); break; // ignore top-level semicolons.
1528 case tok_def: HandleDefinition(); break;
1529 case tok_extern: HandleExtern(); break;
1530 default: HandleTopLevelExpression(); break;
1531 }
1532 }
1533 }
1534
1535 //===----------------------------------------------------------------------===//
1536 // "Library" functions that can be "extern'd" from user code.
1537 //===----------------------------------------------------------------------===//
1538
1539 /// putchard - putchar that takes a double and returns 0.
1540 extern "C"
1541 double putchard(double X) {
1542 putchar((char)X);
1543 return 0;
1544 }
1545
1546 //===----------------------------------------------------------------------===//
1547 // Main driver code.
1548 //===----------------------------------------------------------------------===//
1549
1550 int main() {
1551 InitializeNativeTarget();
1552 LLVMContext &Context = getGlobalContext();
1553
1554 // Install standard binary operators.
1555 // 1 is lowest precedence.
1556 BinopPrecedence['<'] = 10;
1557 BinopPrecedence['+'] = 20;
1558 BinopPrecedence['-'] = 20;
1559 BinopPrecedence['*'] = 40; // highest.
1560
1561 // Prime the first token.
1562 fprintf(stderr, "ready> ");
1563 getNextToken();
1564
1565 // Make the module, which holds all the code.
1566 TheModule = new Module("my cool jit", Context);
1567
1568 // Create the JIT. This takes ownership of the module.
1569 std::string ErrStr;
1570 TheExecutionEngine = EngineBuilder(TheModule).setErrorStr(&ErrStr).create();
1571 if (!TheExecutionEngine) {
1572 fprintf(stderr, "Could not create ExecutionEngine: %s\n", ErrStr.c_str());
1573 exit(1);
1574 }
1575
1576 FunctionPassManager OurFPM(TheModule);
1577
1578 // Set up the optimizer pipeline. Start with registering info about how the
1579 // target lays out data structures.
1580 OurFPM.add(new DataLayout(*TheExecutionEngine->getDataLayout()));
1581 // Provide basic AliasAnalysis support for GVN.
1582 OurFPM.add(createBasicAliasAnalysisPass());
1583 // Do simple "peephole" optimizations and bit-twiddling optzns.
1584 OurFPM.add(createInstructionCombiningPass());
1585 // Reassociate expressions.
1586 OurFPM.add(createReassociatePass());
1587 // Eliminate Common SubExpressions.
1588 OurFPM.add(createGVNPass());
1589 // Simplify the control flow graph (deleting unreachable blocks, etc).
1590 OurFPM.add(createCFGSimplificationPass());
1591
1592 OurFPM.doInitialization();
1593
1594 // Set the global so the code gen can use this.
1595 TheFPM = &OurFPM;
1596
1597 // Run the main "interpreter loop" now.
1598 MainLoop();
1599
1600 TheFPM = 0;
1601
1602 // Print out all of the generated code.
1603 TheModule->dump();
1604
1605 return 0;
1606 }
1607
1608`Next: Extending the language: user-defined operators <LangImpl6.html>`_
1609