blob: 5a184c2c1817d466de465af432bf12dc3bc9faa0 [file] [log] [blame]
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +00001<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
2 "http://www.w3.org/TR/html4/strict.dtd">
3
4<html>
5<head>
Chris Lattner58f2c872007-11-02 05:42:52 +00006 <title>Kaleidoscope: Extending the Language: User-defined Operators</title>
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +00007 <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
8 <meta name="author" content="Chris Lattner">
9 <link rel="stylesheet" href="../llvm.css" type="text/css">
10</head>
11
12<body>
13
Chris Lattner58f2c872007-11-02 05:42:52 +000014<div class="doc_title">Kaleidoscope: Extending the Language: User-defined Operators</div>
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +000015
16<div class="doc_author">
17 <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a></p>
18</div>
19
20<!-- *********************************************************************** -->
21<div class="doc_section"><a name="intro">Part 6 Introduction</a></div>
22<!-- *********************************************************************** -->
23
24<div class="doc_text">
25
26<p>Welcome to Part 6 of the "<a href="index.html">Implementing a language with
27LLVM</a>" tutorial. At this point in our tutorial, we now have a fully
28functional language that is fairly minimal, but also useful. One big problem
29with it though is that it doesn't have many useful operators (like division,
30logical negation, or even any comparisons other than less-than.</p>
31
Chris Lattner58f2c872007-11-02 05:42:52 +000032<p>This chapter of the tutorial takes a wild digression into adding user-defined
33operators to the simple and beautiful Kaleidoscope language, giving us a
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +000034simple and ugly language in some ways, but also a powerful one at the same time.
35One of the great things about creating your own language is that you get to
36decide what is good or bad. In this tutorial we'll assume that it is okay and
37use this as a way to show some interesting parsing techniques.</p>
38
Chris Lattner58f2c872007-11-02 05:42:52 +000039<p>At the end of this tutorial, we'll <a href="#example">run through a nice
40little example</a> that shows an example application that you can build with
41Kaleidoscope and the feature set it now has.</p>
42
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +000043</div>
44
45<!-- *********************************************************************** -->
Chris Lattner58f2c872007-11-02 05:42:52 +000046<div class="doc_section"><a name="idea">User-defined Operators: the Idea</a></div>
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +000047<!-- *********************************************************************** -->
48
49<div class="doc_text">
50
51<p>
Chris Lattner58f2c872007-11-02 05:42:52 +000052The "operator overloading" that we will add to Kaleidoscope is more general than
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +000053languages like C++. In C++, you are only allowed to redefine existing
54operators: you can't programatically change the grammar, introduce new
55operators, change precedence levels, etc. In this chapter, we will add this
56capability to Kaleidoscope, which will allow us to round out the set of
57operators that are supported, culminating in a more interesting example app.</p>
58
Chris Lattner58f2c872007-11-02 05:42:52 +000059<p>The point of going into user-defined operators in a tutorial like this is to
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +000060show the power and flexibility of using a hand-written parser. The parser we
61are using so far is using recursive descent for most parts of the grammar, and
62operator precedence parsing for the expressions. See <a
63href="LangImpl2.html">Chapter 2</a> for details. Without using operator
64precedence parsing, it would be very difficult to allow the programmer to
65introduce new operators into the grammar: the grammar is dynamically extensible
66as the JIT runs.</p>
67
68<p>The two specific features we'll add are programmable unary operators (right
69now, Kaleidoscope has no unary operators at all) as well as binary operators.
70An example of this is:</p>
71
72<div class="doc_code">
73<pre>
74# Logical unary not.
75def unary!(v)
76 if v then
77 0
78 else
79 1;
80
81# Define &gt; with the same precedence as &lt;.
82def binary&gt; 10 (LHS RHS)
83 !(LHS &lt; RHS); # alternatively, could just use "RHS &lt; LHS"
84
85# Binary "logical or", (note that it does not "short circuit")
86def binary| 5 (LHS RHS)
87 if LHS then
88 1
89 else if RHS then
90 1
91 else
92 0;
93
94# Define = with slightly lower precedence than relationals.
95def binary= 9 (LHS RHS)
96 !(LHS &lt; RHS | LHS &gt; RHS);
97</pre>
98</div>
99
100<p>Many languages aspire to being able to implement their standard runtime
101library in the language itself. In Kaleidoscope, we can implement significant
102parts of the language in the library!</p>
103
104<p>We will break down implementation of these features into two parts:
Chris Lattner58f2c872007-11-02 05:42:52 +0000105implementing support for user-defined binary operators and adding unary
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +0000106operators.</p>
107
108</div>
109
110<!-- *********************************************************************** -->
Chris Lattner58f2c872007-11-02 05:42:52 +0000111<div class="doc_section"><a name="binary">User-defined Binary Operators</a></div>
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +0000112<!-- *********************************************************************** -->
113
114<div class="doc_text">
115
Chris Lattner58f2c872007-11-02 05:42:52 +0000116<p>Adding support for user-defined binary operators is pretty simple with our
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +0000117current framework. We'll first add support for the unary/binary keywords:</p>
118
119<div class="doc_code">
120<pre>
121enum Token {
122 ...
123 <b>// operators
124 tok_binary = -11, tok_unary = -12</b>
125};
126...
127static int gettok() {
128...
129 if (IdentifierStr == "for") return tok_for;
130 if (IdentifierStr == "in") return tok_in;
131 <b>if (IdentifierStr == "binary") return tok_binary;
132 if (IdentifierStr == "unary") return tok_unary;</b>
133 return tok_identifier;
134</pre>
135</div>
136
137<p>This just adds lexer support for the unary and binary keywords, like we
138did in <a href="LangImpl5.html#iflexer">previous chapters</a>. One nice thing
139about our current AST is that we represent binary operators fully generally
140with their ASCII code as the opcode. For our extended operators, we'll use the
141same representation, so we don't need any new AST or parser support.</p>
142
143<p>On the other hand, we have to be able to represent the definitions of these
144new operators, in the "def binary| 5" part of the function definition. In the
145grammar so far, the "name" for the function definition is parsed as the
146"prototype" production and into the <tt>PrototypeAST</tt> AST node. To
147represent our new user-defined operators as prototypes, we have to extend
148the <tt>PrototypeAST</tt> AST node like this:</p>
149
150<div class="doc_code">
151<pre>
152/// PrototypeAST - This class represents the "prototype" for a function,
153/// which captures its argument names as well as if it is an operator.
154class PrototypeAST {
155 std::string Name;
156 std::vector&lt;std::string&gt; Args;
157 <b>bool isOperator;
158 unsigned Precedence; // Precedence if a binary op.</b>
159public:
160 PrototypeAST(const std::string &amp;name, const std::vector&lt;std::string&gt; &amp;args,
161 <b>bool isoperator = false, unsigned prec = 0</b>)
162 : Name(name), Args(args), <b>isOperator(isoperator), Precedence(prec)</b> {}
163
164 <b>bool isUnaryOp() const { return isOperator &amp;&amp; Args.size() == 1; }
165 bool isBinaryOp() const { return isOperator &amp;&amp; Args.size() == 2; }
166
167 char getOperatorName() const {
168 assert(isUnaryOp() || isBinaryOp());
169 return Name[Name.size()-1];
170 }
171
172 unsigned getBinaryPrecedence() const { return Precedence; }</b>
173
174 Function *Codegen();
175};
176</pre>
177</div>
178
179<p>Basically, in addition to knowing a name for the prototype, we now keep track
180of whether it was an operator, and if it was, what precedence level the operator
Chris Lattner58f2c872007-11-02 05:42:52 +0000181is at. The precedence is only used for binary operators (as you'll see below,
182it just doesn't apply for unary operators). Now that we have a way to represent
183the prototype for a user-defined operator, we need to parse it:</p>
184
185<div class="doc_code">
186<pre>
187/// prototype
188/// ::= id '(' id* ')'
189<b>/// ::= binary LETTER number? (id, id)</b>
190static PrototypeAST *ParsePrototype() {
191 std::string FnName;
192
193 <b>int Kind = 0; // 0 = identifier, 1 = unary, 2 = binary.
194 unsigned BinaryPrecedence = 30;</b>
195
196 switch (CurTok) {
197 default:
198 return ErrorP("Expected function name in prototype");
199 case tok_identifier:
200 FnName = IdentifierStr;
201 Kind = 0;
202 getNextToken();
203 break;
204 <b>case tok_binary:
205 getNextToken();
206 if (!isascii(CurTok))
207 return ErrorP("Expected binary operator");
208 FnName = "binary";
209 FnName += (char)CurTok;
210 Kind = 2;
211 getNextToken();
212
213 // Read the precedence if present.
214 if (CurTok == tok_number) {
215 if (NumVal &lt; 1 || NumVal &gt; 100)
216 return ErrorP("Invalid precedecnce: must be 1..100");
217 BinaryPrecedence = (unsigned)NumVal;
218 getNextToken();
219 }
220 break;</b>
221 }
222
223 if (CurTok != '(')
224 return ErrorP("Expected '(' in prototype");
225
226 std::vector&lt;std::string&gt; ArgNames;
227 while (getNextToken() == tok_identifier)
228 ArgNames.push_back(IdentifierStr);
229 if (CurTok != ')')
230 return ErrorP("Expected ')' in prototype");
231
232 // success.
233 getNextToken(); // eat ')'.
234
235 <b>// Verify right number of names for operator.
236 if (Kind &amp;&amp; ArgNames.size() != Kind)
237 return ErrorP("Invalid number of operands for operator");
238
239 return new PrototypeAST(FnName, ArgNames, Kind != 0, BinaryPrecedence);</b>
240}
241</pre>
242</div>
243
244<p>This is all fairly straight-forward parsing code, and we have already seen
245a lot of similar code in the past. One interesting piece of this is the part
246that sets up <tt>FnName</tt> for binary operators. This builds names like
247"binary@" for a newly defined "@" operator. This takes advantage of the fact
248that symbol names in the LLVM symbol table are allowed to have any character in
249them, inluding embedded nul characters.</p>
250
251<p>The next interesting piece is codegen support for these binary operators.
252Given our current structure, this is a simple addition of a default case for our
253existing binary operator node:</p>
254
255<div class="doc_code">
256<pre>
257Value *BinaryExprAST::Codegen() {
258 Value *L = LHS-&gt;Codegen();
259 Value *R = RHS-&gt;Codegen();
260 if (L == 0 || R == 0) return 0;
261
262 switch (Op) {
263 case '+': return Builder.CreateAdd(L, R, "addtmp");
264 case '-': return Builder.CreateSub(L, R, "subtmp");
265 case '*': return Builder.CreateMul(L, R, "multmp");
266 case '&lt;':
267 L = Builder.CreateFCmpULT(L, R, "multmp");
268 // Convert bool 0/1 to double 0.0 or 1.0
269 return Builder.CreateUIToFP(L, Type::DoubleTy, "booltmp");
270 <b>default: break;</b>
271 }
272
273 <b>// If it wasn't a builtin binary operator, it must be a user defined one. Emit
274 // a call to it.
275 Function *F = TheModule-&gt;getFunction(std::string("binary")+Op);
276 assert(F &amp;&amp; "binary operator not found!");
277
278 Value *Ops[] = { L, R };
279 return Builder.CreateCall(F, Ops, Ops+2, "binop");</b>
280}
281
282</pre>
283</div>
284
285<p>As you can see above, the new code is actually really simple. It just does
286a lookup for the appropriate operator in the symbol table and generates a
287function call to it. Since user-defined operators are just built as normal
288functions (because the "prototype" boils down into a function with the right
289name) everything falls into place.</p>
290
291<p>The final missing piece is a bit of top level magic, here:</p>
292
293<div class="doc_code">
294<pre>
295Function *FunctionAST::Codegen() {
296 NamedValues.clear();
297
298 Function *TheFunction = Proto->Codegen();
299 if (TheFunction == 0)
300 return 0;
301
302 <b>// If this is an operator, install it.
303 if (Proto-&gt;isBinaryOp())
304 BinopPrecedence[Proto->getOperatorName()] = Proto->getBinaryPrecedence();</b>
305
306 // Create a new basic block to start insertion into.
307 BasicBlock *BB = new BasicBlock("entry", TheFunction);
308 Builder.SetInsertPoint(BB);
309
310 if (Value *RetVal = Body-&gt;Codegen()) {
311 ...
312</pre>
313</div>
314
315<p>Basically, before codegening a function, if it is a user-defined operator, we
316register it in the precedence table. This allows the binary operator parsing
317logic we already have to handle it. Since it is a fully-general operator
318precedence parser, this is all we need to do to "extend the grammar".</p>
319
320<p>With that, we have useful user-defined binary operators. This builds a lot
321on the previous framework we built for other operators. Adding unary operators
322is a bit more challenging, because we don't have any framework for it yet, lets
323see what it takes.</p>
324
325</div>
326
327<!-- *********************************************************************** -->
328<div class="doc_section"><a name="unary">User-defined Unary Operators</a></div>
329<!-- *********************************************************************** -->
330
331<div class="doc_text">
332
333<p>Since we don't currently support unary operators in the Kaleidoscope
334langugage, we'll need to add everything for them. Above, we added simple
335support for the 'unary' keyword to the lexer. In addition to that, we need an
336AST node:</p>
337
338<div class="doc_code">
339<pre>
340/// UnaryExprAST - Expression class for a unary operator.
341class UnaryExprAST : public ExprAST {
342 char Opcode;
343 ExprAST *Operand;
344public:
345 UnaryExprAST(char opcode, ExprAST *operand)
346 : Opcode(opcode), Operand(operand) {}
347 virtual Value *Codegen();
348};
349</pre>
350</div>
351
352<p>This AST node is very simple and obvious by now. It directly mirrors the
353binary operator AST node, except that it only has one child. With this, we
354need to add the parsing logic. Parsing a unary operator is pretty simple: we'll
355add a new function to do it:</p>
356
357<div class="doc_code">
358<pre>
359/// unary
360/// ::= primary
361/// ::= '!' unary
362static ExprAST *ParseUnary() {
363 // If the current token is not an operator, it must be a primary expr.
364 if (!isascii(CurTok) || CurTok == '(' || CurTok == ',')
365 return ParsePrimary();
366
367 // If this is a unary operator, read it.
368 int Opc = CurTok;
369 getNextToken();
370 if (ExprAST *Operand = ParseUnary())
371 return new UnaryExprAST(Opc, Operand);
372 return 0;
373}
374</pre>
375</div>
376
377<p>The grammar we add is pretty straight-forward here. If we see a unary
378operator when parsing a primary operator, we eat the operator as a prefix and
379parse the remaining piece as another unary operator. This allows us to handle
380multiple unary operators (e.g. "!!x"). Note that unary operators can't have
381ambiguous parses like binary operators can, so there is no need for precedence
382information.</p>
383
384<p>The problem with the above is that we need to call ParseUnary from somewhere.
385To do this, we change previous callers of ParsePrimary to call ParseUnary
386instead:</p>
387
388<div class="doc_code">
389<pre>
390/// binoprhs
391/// ::= ('+' unary)*
392static ExprAST *ParseBinOpRHS(int ExprPrec, ExprAST *LHS) {
393 ...
394 <b>// Parse the unary expression after the binary operator.
395 ExprAST *RHS = ParseUnary();
396 if (!RHS) return 0;</b>
397 ...
398}
399/// expression
400/// ::= unary binoprhs
401///
402static ExprAST *ParseExpression() {
403 <b>ExprAST *LHS = ParseUnary();</b>
404 if (!LHS) return 0;
405
406 return ParseBinOpRHS(0, LHS);
407}
408</pre>
409</div>
410
411<p>With these two simple changes, we now parse unary operators and build the
412AST for them. Next up, we need to add parser support for prototypes, to parse
413the unary operator prototype. We extend the binary operator code above
414with:</p>
415
416<div class="doc_code">
417<pre>
418/// prototype
419/// ::= id '(' id* ')'
420/// ::= binary LETTER number? (id, id)
421<b>/// ::= unary LETTER (id)</b>
422static PrototypeAST *ParsePrototype() {
423 std::string FnName;
424
425 int Kind = 0; // 0 = identifier, 1 = unary, 2 = binary.
426 unsigned BinaryPrecedence = 30;
427
428 switch (CurTok) {
429 default:
430 return ErrorP("Expected function name in prototype");
431 case tok_identifier:
432 FnName = IdentifierStr;
433 Kind = 0;
434 getNextToken();
435 break;
436 <b>case tok_unary:
437 getNextToken();
438 if (!isascii(CurTok))
439 return ErrorP("Expected unary operator");
440 FnName = "unary";
441 FnName += (char)CurTok;
442 Kind = 1;
443 getNextToken();
444 break;</b>
445 case tok_binary:
446 ...
447</pre>
448</div>
449
450<p>As with binary operators, we name unary operators with a name that includes
451the operator character. This assists us at code generation time. Speaking of,
452the final piece we need to add is codegen support for unary operators. It looks
453like this:</p>
454
455<div class="doc_code">
456<pre>
457Value *UnaryExprAST::Codegen() {
458 Value *OperandV = Operand->Codegen();
459 if (OperandV == 0) return 0;
460
461 Function *F = TheModule->getFunction(std::string("unary")+Opcode);
462 if (F == 0)
463 return ErrorV("Unknown unary operator");
464
465 return Builder.CreateCall(F, OperandV, "unop");
466}
467</pre>
468</div>
469
470<p>This code is similar to, but simpler than, the code for binary operators. It
471is simpler primarily because it doesn't need to handle any predefined operators.
472</p>
473
474</div>
475
476<!-- *********************************************************************** -->
477<div class="doc_section"><a name="example">Kicking the Tires</a></div>
478<!-- *********************************************************************** -->
479
480<div class="doc_text">
481
482<p>It is somewhat hard to believe, but with a few simple extensions we've
Chris Lattnerb5d81b32007-11-02 05:54:25 +0000483covered in the last chapters, we have grown a real-ish language. With this, we
Chris Lattner58f2c872007-11-02 05:42:52 +0000484can do a lot of interesting things, including I/O, math, and a bunch of other
485things. For example, we can now add a nice sequencing operator (printd is
486defined to print out the specified value and a newline):</p>
487
488<div class="doc_code">
489<pre>
490ready&gt; <b>extern printd(x);</b>
491Read extern: declare double @printd(double)
492ready&gt; <b>def binary : 1 (x y) 0; # Low-precedence operator that ignores operands.</b>
493..
494ready&gt; <b>printd(123) : printd(456) : printd(789);</b>
495123.000000
496456.000000
497789.000000
498Evaluated to 0.000000
499</pre>
500</div>
501
Chris Lattnerb5d81b32007-11-02 05:54:25 +0000502<p>We can also define a bunch of other "primitive" operations, such as:</p>
Chris Lattner58f2c872007-11-02 05:42:52 +0000503
504<div class="doc_code">
505<pre>
506# Logical unary not.
507def unary!(v)
508 if v then
509 0
510 else
511 1;
512
513# Unary negate.
514def unary-(v)
515 0-v;
516
517# Define &gt; with the same precedence as &gt;. We could also easily define
518# &lt;= etc.
519def binary&gt; 10 (LHS RHS)
520 !(LHS &lt; RHS);
521
522# Binary logical or, which does not short circuit.
523def binary| 5 (LHS RHS)
524 if LHS then
525 1
526 else if RHS then
527 1
528 else
529 0;
530
531# Binary logical and, which does not short circuit.
532def binary&amp; 6 (LHS RHS)
533 if !LHS then
534 0
535 else
536 !!RHS;
537
538# Define = with slightly lower precedence than relationals.
539def binary = 9 (LHS RHS)
540 !(LHS &lt; RHS | LHS &gt; RHS);
541
542</pre>
543</div>
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +0000544
545
Chris Lattner58f2c872007-11-02 05:42:52 +0000546<p>Given the previous if/then/else support, we can also define interesting
547functions for I/O. For example, the following prints out a character whose
548"density" reflects the value passed in: the lower the value, the denser the
549character:</p>
550
551<div class="doc_code">
552<pre>
553ready&gt;
554<b>
555extern putchard(char)
556def printdensity(d)
557 if d &gt; 8 then
558 putchard(32) # ' '
559 else if d &gt; 4 then
560 putchard(46) # '.'
561 else if d &gt; 2 then
562 putchard(43) # '+'
563 else
564 putchard(42); # '*'</b>
565...
566ready&gt; <b>printdensity(1): printdensity(2): printdensity(3) :
567 printdensity(4): printdensity(5): printdensity(9): putchard(10);</b>
568*++..
569Evaluated to 0.000000
570</pre>
571</div>
572
573<p>Based on these simple primitive operations, we can start to define more
574interesting things. For example, here's a little function that solves for the
575number of iterations it takes for a function in the complex plane to
576converge:</p>
577
578<div class="doc_code">
579<pre>
580# determine whether the specific location diverges.
581# Solve for z = z^2 + c in the complex plane.
582def mandleconverger(real imag iters creal cimag)
583 if iters &gt; 255 | (real*real + imag*imag &gt; 4) then
584 iters
585 else
586 mandleconverger(real*real - imag*imag + creal,
587 2*real*imag + cimag,
588 iters+1, creal, cimag);
589
590# return the number of iterations required for the iteration to escape
591def mandleconverge(real imag)
592 mandleconverger(real, imag, 0, real, imag);
593</pre>
594</div>
595
Chris Lattnerb5d81b32007-11-02 05:54:25 +0000596<p>This "z = z<sup>2</sup> + c" function is a beautiful little creature that is the basis
Chris Lattner58f2c872007-11-02 05:42:52 +0000597for computation of the <a
598href="http://en.wikipedia.org/wiki/Mandelbrot_set">Mandelbrot Set</a>. Our
599<tt>mandelconverge</tt> function returns the number of iterations that it takes
600for a complex orbit to escape, saturating to 255. This is not a very useful
601function by itself, but if you plot its value over a two-dimensional plane,
602you can see the mandelbrot set. Given that we are limited to using putchard
603here, our amazing graphical output is limited, but we can whip together
604something using the density plotter above:</p>
605
606<div class="doc_code">
607<pre>
608# compute and plot the mandlebrot set with the specified 2 dimentional range
609# info.
610def mandelhelp(xmin xmax xstep ymin ymax ystep)
611 for y = ymin, y &lt; ymax, ystep in (
612 (for x = xmin, x &lt; xmax, xstep in
613 printdensity(mandleconverge(x,y)))
614 : putchard(10)
615 )
616
617# mandel - This is a convenient helper function for ploting the mandelbrot set
618# from the specified position with the specified magnification.
619def mandel(realstart imagstart realmag imagmag)
620 mandelhelp(realstart, realstart+realmag*78, realmag,
621 imagstart, imagstart+imagmag*40, imagmag);
622</pre>
623</div>
624
625<p>Given this, we can try plotting out the mandlebrot set! Lets try it out:</p>
626
627<div class="doc_code">
628<pre>
629ready&gt; <b>mandel(-2.3, -1.3, 0.05, 0.07);</b>
630*******************************+++++++++++*************************************
631*************************+++++++++++++++++++++++*******************************
632**********************+++++++++++++++++++++++++++++****************************
633*******************+++++++++++++++++++++.. ...++++++++*************************
634*****************++++++++++++++++++++++.... ...+++++++++***********************
635***************+++++++++++++++++++++++..... ...+++++++++*********************
636**************+++++++++++++++++++++++.... ....+++++++++********************
637*************++++++++++++++++++++++...... .....++++++++*******************
638************+++++++++++++++++++++....... .......+++++++******************
639***********+++++++++++++++++++.... ... .+++++++*****************
640**********+++++++++++++++++....... .+++++++****************
641*********++++++++++++++........... ...+++++++***************
642********++++++++++++............ ...++++++++**************
643********++++++++++... .......... .++++++++**************
644*******+++++++++..... .+++++++++*************
645*******++++++++...... ..+++++++++*************
646*******++++++....... ..+++++++++*************
647*******+++++...... ..+++++++++*************
648*******.... .... ...+++++++++*************
649*******.... . ...+++++++++*************
650*******+++++...... ...+++++++++*************
651*******++++++....... ..+++++++++*************
652*******++++++++...... .+++++++++*************
653*******+++++++++..... ..+++++++++*************
654********++++++++++... .......... .++++++++**************
655********++++++++++++............ ...++++++++**************
656*********++++++++++++++.......... ...+++++++***************
657**********++++++++++++++++........ .+++++++****************
658**********++++++++++++++++++++.... ... ..+++++++****************
659***********++++++++++++++++++++++....... .......++++++++*****************
660************+++++++++++++++++++++++...... ......++++++++******************
661**************+++++++++++++++++++++++.... ....++++++++********************
662***************+++++++++++++++++++++++..... ...+++++++++*********************
663*****************++++++++++++++++++++++.... ...++++++++***********************
664*******************+++++++++++++++++++++......++++++++*************************
665*********************++++++++++++++++++++++.++++++++***************************
666*************************+++++++++++++++++++++++*******************************
667******************************+++++++++++++************************************
668*******************************************************************************
669*******************************************************************************
670*******************************************************************************
671Evaluated to 0.000000
672ready&gt; <b>mandel(-2, -1, 0.02, 0.04);</b>
673**************************+++++++++++++++++++++++++++++++++++++++++++++++++++++
674***********************++++++++++++++++++++++++++++++++++++++++++++++++++++++++
675*********************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++.
676*******************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++...
677*****************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++.....
678***************++++++++++++++++++++++++++++++++++++++++++++++++++++++++........
679**************++++++++++++++++++++++++++++++++++++++++++++++++++++++...........
680************+++++++++++++++++++++++++++++++++++++++++++++++++++++..............
681***********++++++++++++++++++++++++++++++++++++++++++++++++++........ .
682**********++++++++++++++++++++++++++++++++++++++++++++++.............
683********+++++++++++++++++++++++++++++++++++++++++++..................
684*******+++++++++++++++++++++++++++++++++++++++.......................
685******+++++++++++++++++++++++++++++++++++...........................
686*****++++++++++++++++++++++++++++++++............................
687*****++++++++++++++++++++++++++++...............................
688****++++++++++++++++++++++++++...... .........................
689***++++++++++++++++++++++++......... ...... ...........
690***++++++++++++++++++++++............
691**+++++++++++++++++++++..............
692**+++++++++++++++++++................
693*++++++++++++++++++.................
694*++++++++++++++++............ ...
695*++++++++++++++..............
696*+++....++++................
697*.......... ...........
698*
699*.......... ...........
700*+++....++++................
701*++++++++++++++..............
702*++++++++++++++++............ ...
703*++++++++++++++++++.................
704**+++++++++++++++++++................
705**+++++++++++++++++++++..............
706***++++++++++++++++++++++............
707***++++++++++++++++++++++++......... ...... ...........
708****++++++++++++++++++++++++++...... .........................
709*****++++++++++++++++++++++++++++...............................
710*****++++++++++++++++++++++++++++++++............................
711******+++++++++++++++++++++++++++++++++++...........................
712*******+++++++++++++++++++++++++++++++++++++++.......................
713********+++++++++++++++++++++++++++++++++++++++++++..................
714Evaluated to 0.000000
715ready&gt; <b>mandel(-0.9, -1.4, 0.02, 0.03);</b>
716*******************************************************************************
717*******************************************************************************
718*******************************************************************************
719**********+++++++++++++++++++++************************************************
720*+++++++++++++++++++++++++++++++++++++++***************************************
721+++++++++++++++++++++++++++++++++++++++++++++**********************************
722++++++++++++++++++++++++++++++++++++++++++++++++++*****************************
723++++++++++++++++++++++++++++++++++++++++++++++++++++++*************************
724+++++++++++++++++++++++++++++++++++++++++++++++++++++++++**********************
725+++++++++++++++++++++++++++++++++.........++++++++++++++++++*******************
726+++++++++++++++++++++++++++++++.... ......+++++++++++++++++++****************
727+++++++++++++++++++++++++++++....... ........+++++++++++++++++++**************
728++++++++++++++++++++++++++++........ ........++++++++++++++++++++************
729+++++++++++++++++++++++++++......... .. ...+++++++++++++++++++++**********
730++++++++++++++++++++++++++........... ....++++++++++++++++++++++********
731++++++++++++++++++++++++............. .......++++++++++++++++++++++******
732+++++++++++++++++++++++............. ........+++++++++++++++++++++++****
733++++++++++++++++++++++........... ..........++++++++++++++++++++++***
734++++++++++++++++++++........... .........++++++++++++++++++++++*
735++++++++++++++++++............ ...........++++++++++++++++++++
736++++++++++++++++............... .............++++++++++++++++++
737++++++++++++++................. ...............++++++++++++++++
738++++++++++++.................. .................++++++++++++++
739+++++++++.................. .................+++++++++++++
740++++++........ . ......... ..++++++++++++
741++............ ...... ....++++++++++
742.............. ...++++++++++
743.............. ....+++++++++
744.............. .....++++++++
745............. ......++++++++
746........... .......++++++++
747......... ........+++++++
748......... ........+++++++
749......... ....+++++++
750........ ...+++++++
751....... ...+++++++
752 ....+++++++
753 .....+++++++
754 ....+++++++
755 ....+++++++
756 ....+++++++
757Evaluated to 0.000000
758ready&gt; <b>^D</b>
759</pre>
760</div>
761
762<p>At this point, you may be starting to realize that Kaleidoscope is a real
763and powerful language. It may not be self-similar :), but it can be used to
764plot things that are!</p>
765
766<p>With this, we conclude the "adding user-defined operators" chapter of the
767tutorial. We successfully extended our language with the ability to extend the
768language in the library, and showed how this can be used to build a simple but
769interesting end user application in Kaleidoscope. At this point, Kaleidoscope
770can build a variety of applications that are functional and can call functions
Chris Lattnerb5d81b32007-11-02 05:54:25 +0000771with side-effects, but it can't actually define and mutate a variable itself.
Chris Lattner58f2c872007-11-02 05:42:52 +0000772</p>
773
774<p>Strikingly, lack of this feature is an important limitation for some
775languages, and it is not at all obvious how to <a href="LangImpl7.html">add
776support for mutable variables</a> without having to add an "SSA construction"
777phase to your front-end. In the next chapter, we will describe how you can
778add this without building SSA in your front-end.</p>
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +0000779
780</div>
781
782
783<!-- *********************************************************************** -->
784<div class="doc_section"><a name="code">Full Code Listing</a></div>
785<!-- *********************************************************************** -->
786
787<div class="doc_text">
788
789<p>
790Here is the complete code listing for our running example, enhanced with the
791if/then/else and for expressions.. To build this example, use:
792</p>
793
794<div class="doc_code">
795<pre>
796 # Compile
797 g++ -g toy.cpp `llvm-config --cppflags --ldflags --libs core jit native` -O3 -o toy
798 # Run
799 ./toy
800</pre>
801</div>
802
803<p>Here is the code:</p>
804
805<div class="doc_code">
806<pre>
Chris Lattner58f2c872007-11-02 05:42:52 +0000807#include "llvm/DerivedTypes.h"
808#include "llvm/ExecutionEngine/ExecutionEngine.h"
809#include "llvm/Module.h"
810#include "llvm/ModuleProvider.h"
811#include "llvm/PassManager.h"
812#include "llvm/Analysis/Verifier.h"
813#include "llvm/Target/TargetData.h"
814#include "llvm/Transforms/Scalar.h"
815#include "llvm/Support/LLVMBuilder.h"
816#include &lt;cstdio&gt;
817#include &lt;string&gt;
818#include &lt;map&gt;
819#include &lt;vector&gt;
820using namespace llvm;
821
822//===----------------------------------------------------------------------===//
823// Lexer
824//===----------------------------------------------------------------------===//
825
826// The lexer returns tokens [0-255] if it is an unknown character, otherwise one
827// of these for known things.
828enum Token {
829 tok_eof = -1,
830
831 // commands
832 tok_def = -2, tok_extern = -3,
833
834 // primary
835 tok_identifier = -4, tok_number = -5,
836
837 // control
838 tok_if = -6, tok_then = -7, tok_else = -8,
839 tok_for = -9, tok_in = -10,
840
841 // operators
842 tok_binary = -11, tok_unary = -12
843};
844
845static std::string IdentifierStr; // Filled in if tok_identifier
846static double NumVal; // Filled in if tok_number
847
848/// gettok - Return the next token from standard input.
849static int gettok() {
850 static int LastChar = ' ';
851
852 // Skip any whitespace.
853 while (isspace(LastChar))
854 LastChar = getchar();
855
856 if (isalpha(LastChar)) { // identifier: [a-zA-Z][a-zA-Z0-9]*
857 IdentifierStr = LastChar;
858 while (isalnum((LastChar = getchar())))
859 IdentifierStr += LastChar;
860
861 if (IdentifierStr == "def") return tok_def;
862 if (IdentifierStr == "extern") return tok_extern;
863 if (IdentifierStr == "if") return tok_if;
864 if (IdentifierStr == "then") return tok_then;
865 if (IdentifierStr == "else") return tok_else;
866 if (IdentifierStr == "for") return tok_for;
867 if (IdentifierStr == "in") return tok_in;
868 if (IdentifierStr == "binary") return tok_binary;
869 if (IdentifierStr == "unary") return tok_unary;
870 return tok_identifier;
871 }
872
873 if (isdigit(LastChar) || LastChar == '.') { // Number: [0-9.]+
874 std::string NumStr;
875 do {
876 NumStr += LastChar;
877 LastChar = getchar();
878 } while (isdigit(LastChar) || LastChar == '.');
879
880 NumVal = strtod(NumStr.c_str(), 0);
881 return tok_number;
882 }
883
884 if (LastChar == '#') {
885 // Comment until end of line.
886 do LastChar = getchar();
887 while (LastChar != EOF &amp;&amp; LastChar != '\n' &amp; LastChar != '\r');
888
889 if (LastChar != EOF)
890 return gettok();
891 }
892
893 // Check for end of file. Don't eat the EOF.
894 if (LastChar == EOF)
895 return tok_eof;
896
897 // Otherwise, just return the character as its ascii value.
898 int ThisChar = LastChar;
899 LastChar = getchar();
900 return ThisChar;
901}
902
903//===----------------------------------------------------------------------===//
904// Abstract Syntax Tree (aka Parse Tree)
905//===----------------------------------------------------------------------===//
906
907/// ExprAST - Base class for all expression nodes.
908class ExprAST {
909public:
910 virtual ~ExprAST() {}
911 virtual Value *Codegen() = 0;
912};
913
914/// NumberExprAST - Expression class for numeric literals like "1.0".
915class NumberExprAST : public ExprAST {
916 double Val;
917public:
918 NumberExprAST(double val) : Val(val) {}
919 virtual Value *Codegen();
920};
921
922/// VariableExprAST - Expression class for referencing a variable, like "a".
923class VariableExprAST : public ExprAST {
924 std::string Name;
925public:
926 VariableExprAST(const std::string &amp;name) : Name(name) {}
927 virtual Value *Codegen();
928};
929
930/// UnaryExprAST - Expression class for a unary operator.
931class UnaryExprAST : public ExprAST {
932 char Opcode;
933 ExprAST *Operand;
934public:
935 UnaryExprAST(char opcode, ExprAST *operand)
936 : Opcode(opcode), Operand(operand) {}
937 virtual Value *Codegen();
938};
939
940/// BinaryExprAST - Expression class for a binary operator.
941class BinaryExprAST : public ExprAST {
942 char Op;
943 ExprAST *LHS, *RHS;
944public:
945 BinaryExprAST(char op, ExprAST *lhs, ExprAST *rhs)
946 : Op(op), LHS(lhs), RHS(rhs) {}
947 virtual Value *Codegen();
948};
949
950/// CallExprAST - Expression class for function calls.
951class CallExprAST : public ExprAST {
952 std::string Callee;
953 std::vector&lt;ExprAST*&gt; Args;
954public:
955 CallExprAST(const std::string &amp;callee, std::vector&lt;ExprAST*&gt; &amp;args)
956 : Callee(callee), Args(args) {}
957 virtual Value *Codegen();
958};
959
960/// IfExprAST - Expression class for if/then/else.
961class IfExprAST : public ExprAST {
962 ExprAST *Cond, *Then, *Else;
963public:
964 IfExprAST(ExprAST *cond, ExprAST *then, ExprAST *_else)
965 : Cond(cond), Then(then), Else(_else) {}
966 virtual Value *Codegen();
967};
968
969/// ForExprAST - Expression class for for/in.
970class ForExprAST : public ExprAST {
971 std::string VarName;
972 ExprAST *Start, *End, *Step, *Body;
973public:
974 ForExprAST(const std::string &amp;varname, ExprAST *start, ExprAST *end,
975 ExprAST *step, ExprAST *body)
976 : VarName(varname), Start(start), End(end), Step(step), Body(body) {}
977 virtual Value *Codegen();
978};
979
980/// PrototypeAST - This class represents the "prototype" for a function,
981/// which captures its argument names as well as if it is an operator.
982class PrototypeAST {
983 std::string Name;
984 std::vector&lt;std::string&gt; Args;
985 bool isOperator;
986 unsigned Precedence; // Precedence if a binary op.
987public:
988 PrototypeAST(const std::string &amp;name, const std::vector&lt;std::string&gt; &amp;args,
989 bool isoperator = false, unsigned prec = 0)
990 : Name(name), Args(args), isOperator(isoperator), Precedence(prec) {}
991
992 bool isUnaryOp() const { return isOperator &amp;&amp; Args.size() == 1; }
993 bool isBinaryOp() const { return isOperator &amp;&amp; Args.size() == 2; }
994
995 char getOperatorName() const {
996 assert(isUnaryOp() || isBinaryOp());
997 return Name[Name.size()-1];
998 }
999
1000 unsigned getBinaryPrecedence() const { return Precedence; }
1001
1002 Function *Codegen();
1003};
1004
1005/// FunctionAST - This class represents a function definition itself.
1006class FunctionAST {
1007 PrototypeAST *Proto;
1008 ExprAST *Body;
1009public:
1010 FunctionAST(PrototypeAST *proto, ExprAST *body)
1011 : Proto(proto), Body(body) {}
1012
1013 Function *Codegen();
1014};
1015
1016//===----------------------------------------------------------------------===//
1017// Parser
1018//===----------------------------------------------------------------------===//
1019
1020/// CurTok/getNextToken - Provide a simple token buffer. CurTok is the current
1021/// token the parser it looking at. getNextToken reads another token from the
1022/// lexer and updates CurTok with its results.
1023static int CurTok;
1024static int getNextToken() {
1025 return CurTok = gettok();
1026}
1027
1028/// BinopPrecedence - This holds the precedence for each binary operator that is
1029/// defined.
1030static std::map&lt;char, int&gt; BinopPrecedence;
1031
1032/// GetTokPrecedence - Get the precedence of the pending binary operator token.
1033static int GetTokPrecedence() {
1034 if (!isascii(CurTok))
1035 return -1;
1036
1037 // Make sure it's a declared binop.
1038 int TokPrec = BinopPrecedence[CurTok];
1039 if (TokPrec &lt;= 0) return -1;
1040 return TokPrec;
1041}
1042
1043/// Error* - These are little helper functions for error handling.
1044ExprAST *Error(const char *Str) { fprintf(stderr, "Error: %s\n", Str);return 0;}
1045PrototypeAST *ErrorP(const char *Str) { Error(Str); return 0; }
1046FunctionAST *ErrorF(const char *Str) { Error(Str); return 0; }
1047
1048static ExprAST *ParseExpression();
1049
1050/// identifierexpr
Chris Lattner20a0c802007-11-05 17:54:34 +00001051/// ::= identifier
1052/// ::= identifier '(' expression* ')'
Chris Lattner58f2c872007-11-02 05:42:52 +00001053static ExprAST *ParseIdentifierExpr() {
1054 std::string IdName = IdentifierStr;
1055
Chris Lattner20a0c802007-11-05 17:54:34 +00001056 getNextToken(); // eat identifier.
Chris Lattner58f2c872007-11-02 05:42:52 +00001057
1058 if (CurTok != '(') // Simple variable ref.
1059 return new VariableExprAST(IdName);
1060
1061 // Call.
1062 getNextToken(); // eat (
1063 std::vector&lt;ExprAST*&gt; Args;
1064 if (CurTok != ')') {
1065 while (1) {
1066 ExprAST *Arg = ParseExpression();
1067 if (!Arg) return 0;
1068 Args.push_back(Arg);
1069
1070 if (CurTok == ')') break;
1071
1072 if (CurTok != ',')
1073 return Error("Expected ')'");
1074 getNextToken();
1075 }
1076 }
1077
1078 // Eat the ')'.
1079 getNextToken();
1080
1081 return new CallExprAST(IdName, Args);
1082}
1083
1084/// numberexpr ::= number
1085static ExprAST *ParseNumberExpr() {
1086 ExprAST *Result = new NumberExprAST(NumVal);
1087 getNextToken(); // consume the number
1088 return Result;
1089}
1090
1091/// parenexpr ::= '(' expression ')'
1092static ExprAST *ParseParenExpr() {
1093 getNextToken(); // eat (.
1094 ExprAST *V = ParseExpression();
1095 if (!V) return 0;
1096
1097 if (CurTok != ')')
1098 return Error("expected ')'");
1099 getNextToken(); // eat ).
1100 return V;
1101}
1102
1103/// ifexpr ::= 'if' expression 'then' expression 'else' expression
1104static ExprAST *ParseIfExpr() {
1105 getNextToken(); // eat the if.
1106
1107 // condition.
1108 ExprAST *Cond = ParseExpression();
1109 if (!Cond) return 0;
1110
1111 if (CurTok != tok_then)
1112 return Error("expected then");
1113 getNextToken(); // eat the then
1114
1115 ExprAST *Then = ParseExpression();
1116 if (Then == 0) return 0;
1117
1118 if (CurTok != tok_else)
1119 return Error("expected else");
1120
1121 getNextToken();
1122
1123 ExprAST *Else = ParseExpression();
1124 if (!Else) return 0;
1125
1126 return new IfExprAST(Cond, Then, Else);
1127}
1128
Chris Lattner20a0c802007-11-05 17:54:34 +00001129/// forexpr ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression
Chris Lattner58f2c872007-11-02 05:42:52 +00001130static ExprAST *ParseForExpr() {
1131 getNextToken(); // eat the for.
1132
1133 if (CurTok != tok_identifier)
1134 return Error("expected identifier after for");
1135
1136 std::string IdName = IdentifierStr;
Chris Lattner20a0c802007-11-05 17:54:34 +00001137 getNextToken(); // eat identifier.
Chris Lattner58f2c872007-11-02 05:42:52 +00001138
1139 if (CurTok != '=')
1140 return Error("expected '=' after for");
1141 getNextToken(); // eat '='.
1142
1143
1144 ExprAST *Start = ParseExpression();
1145 if (Start == 0) return 0;
1146 if (CurTok != ',')
1147 return Error("expected ',' after for start value");
1148 getNextToken();
1149
1150 ExprAST *End = ParseExpression();
1151 if (End == 0) return 0;
1152
1153 // The step value is optional.
1154 ExprAST *Step = 0;
1155 if (CurTok == ',') {
1156 getNextToken();
1157 Step = ParseExpression();
1158 if (Step == 0) return 0;
1159 }
1160
1161 if (CurTok != tok_in)
1162 return Error("expected 'in' after for");
1163 getNextToken(); // eat 'in'.
1164
1165 ExprAST *Body = ParseExpression();
1166 if (Body == 0) return 0;
1167
1168 return new ForExprAST(IdName, Start, End, Step, Body);
1169}
1170
1171
1172/// primary
1173/// ::= identifierexpr
1174/// ::= numberexpr
1175/// ::= parenexpr
1176/// ::= ifexpr
1177/// ::= forexpr
1178static ExprAST *ParsePrimary() {
1179 switch (CurTok) {
1180 default: return Error("unknown token when expecting an expression");
1181 case tok_identifier: return ParseIdentifierExpr();
1182 case tok_number: return ParseNumberExpr();
1183 case '(': return ParseParenExpr();
1184 case tok_if: return ParseIfExpr();
1185 case tok_for: return ParseForExpr();
1186 }
1187}
1188
1189/// unary
1190/// ::= primary
1191/// ::= '!' unary
1192static ExprAST *ParseUnary() {
1193 // If the current token is not an operator, it must be a primary expr.
1194 if (!isascii(CurTok) || CurTok == '(' || CurTok == ',')
1195 return ParsePrimary();
1196
1197 // If this is a unary operator, read it.
1198 int Opc = CurTok;
1199 getNextToken();
1200 if (ExprAST *Operand = ParseUnary())
1201 return new UnaryExprAST(Opc, Operand);
1202 return 0;
1203}
1204
1205/// binoprhs
1206/// ::= ('+' unary)*
1207static ExprAST *ParseBinOpRHS(int ExprPrec, ExprAST *LHS) {
1208 // If this is a binop, find its precedence.
1209 while (1) {
1210 int TokPrec = GetTokPrecedence();
1211
1212 // If this is a binop that binds at least as tightly as the current binop,
1213 // consume it, otherwise we are done.
1214 if (TokPrec &lt; ExprPrec)
1215 return LHS;
1216
1217 // Okay, we know this is a binop.
1218 int BinOp = CurTok;
1219 getNextToken(); // eat binop
1220
1221 // Parse the unary expression after the binary operator.
1222 ExprAST *RHS = ParseUnary();
1223 if (!RHS) return 0;
1224
1225 // If BinOp binds less tightly with RHS than the operator after RHS, let
1226 // the pending operator take RHS as its LHS.
1227 int NextPrec = GetTokPrecedence();
1228 if (TokPrec &lt; NextPrec) {
1229 RHS = ParseBinOpRHS(TokPrec+1, RHS);
1230 if (RHS == 0) return 0;
1231 }
1232
1233 // Merge LHS/RHS.
1234 LHS = new BinaryExprAST(BinOp, LHS, RHS);
1235 }
1236}
1237
1238/// expression
1239/// ::= unary binoprhs
1240///
1241static ExprAST *ParseExpression() {
1242 ExprAST *LHS = ParseUnary();
1243 if (!LHS) return 0;
1244
1245 return ParseBinOpRHS(0, LHS);
1246}
1247
1248/// prototype
1249/// ::= id '(' id* ')'
1250/// ::= binary LETTER number? (id, id)
1251/// ::= unary LETTER (id)
1252static PrototypeAST *ParsePrototype() {
1253 std::string FnName;
1254
1255 int Kind = 0; // 0 = identifier, 1 = unary, 2 = binary.
1256 unsigned BinaryPrecedence = 30;
1257
1258 switch (CurTok) {
1259 default:
1260 return ErrorP("Expected function name in prototype");
1261 case tok_identifier:
1262 FnName = IdentifierStr;
1263 Kind = 0;
1264 getNextToken();
1265 break;
1266 case tok_unary:
1267 getNextToken();
1268 if (!isascii(CurTok))
1269 return ErrorP("Expected unary operator");
1270 FnName = "unary";
1271 FnName += (char)CurTok;
1272 Kind = 1;
1273 getNextToken();
1274 break;
1275 case tok_binary:
1276 getNextToken();
1277 if (!isascii(CurTok))
1278 return ErrorP("Expected binary operator");
1279 FnName = "binary";
1280 FnName += (char)CurTok;
1281 Kind = 2;
1282 getNextToken();
1283
1284 // Read the precedence if present.
1285 if (CurTok == tok_number) {
1286 if (NumVal &lt; 1 || NumVal &gt; 100)
1287 return ErrorP("Invalid precedecnce: must be 1..100");
1288 BinaryPrecedence = (unsigned)NumVal;
1289 getNextToken();
1290 }
1291 break;
1292 }
1293
1294 if (CurTok != '(')
1295 return ErrorP("Expected '(' in prototype");
1296
1297 std::vector&lt;std::string&gt; ArgNames;
1298 while (getNextToken() == tok_identifier)
1299 ArgNames.push_back(IdentifierStr);
1300 if (CurTok != ')')
1301 return ErrorP("Expected ')' in prototype");
1302
1303 // success.
1304 getNextToken(); // eat ')'.
1305
1306 // Verify right number of names for operator.
1307 if (Kind &amp;&amp; ArgNames.size() != Kind)
1308 return ErrorP("Invalid number of operands for operator");
1309
1310 return new PrototypeAST(FnName, ArgNames, Kind != 0, BinaryPrecedence);
1311}
1312
1313/// definition ::= 'def' prototype expression
1314static FunctionAST *ParseDefinition() {
1315 getNextToken(); // eat def.
1316 PrototypeAST *Proto = ParsePrototype();
1317 if (Proto == 0) return 0;
1318
1319 if (ExprAST *E = ParseExpression())
1320 return new FunctionAST(Proto, E);
1321 return 0;
1322}
1323
1324/// toplevelexpr ::= expression
1325static FunctionAST *ParseTopLevelExpr() {
1326 if (ExprAST *E = ParseExpression()) {
1327 // Make an anonymous proto.
1328 PrototypeAST *Proto = new PrototypeAST("", std::vector&lt;std::string&gt;());
1329 return new FunctionAST(Proto, E);
1330 }
1331 return 0;
1332}
1333
1334/// external ::= 'extern' prototype
1335static PrototypeAST *ParseExtern() {
1336 getNextToken(); // eat extern.
1337 return ParsePrototype();
1338}
1339
1340//===----------------------------------------------------------------------===//
1341// Code Generation
1342//===----------------------------------------------------------------------===//
1343
1344static Module *TheModule;
1345static LLVMFoldingBuilder Builder;
1346static std::map&lt;std::string, Value*&gt; NamedValues;
1347static FunctionPassManager *TheFPM;
1348
1349Value *ErrorV(const char *Str) { Error(Str); return 0; }
1350
1351Value *NumberExprAST::Codegen() {
1352 return ConstantFP::get(Type::DoubleTy, APFloat(Val));
1353}
1354
1355Value *VariableExprAST::Codegen() {
1356 // Look this variable up in the function.
1357 Value *V = NamedValues[Name];
1358 return V ? V : ErrorV("Unknown variable name");
1359}
1360
1361Value *UnaryExprAST::Codegen() {
1362 Value *OperandV = Operand-&gt;Codegen();
1363 if (OperandV == 0) return 0;
1364
1365 Function *F = TheModule-&gt;getFunction(std::string("unary")+Opcode);
1366 if (F == 0)
1367 return ErrorV("Unknown unary operator");
1368
1369 return Builder.CreateCall(F, OperandV, "unop");
1370}
1371
1372
1373Value *BinaryExprAST::Codegen() {
1374 Value *L = LHS-&gt;Codegen();
1375 Value *R = RHS-&gt;Codegen();
1376 if (L == 0 || R == 0) return 0;
1377
1378 switch (Op) {
1379 case '+': return Builder.CreateAdd(L, R, "addtmp");
1380 case '-': return Builder.CreateSub(L, R, "subtmp");
1381 case '*': return Builder.CreateMul(L, R, "multmp");
1382 case '&lt;':
1383 L = Builder.CreateFCmpULT(L, R, "multmp");
1384 // Convert bool 0/1 to double 0.0 or 1.0
1385 return Builder.CreateUIToFP(L, Type::DoubleTy, "booltmp");
1386 default: break;
1387 }
1388
1389 // If it wasn't a builtin binary operator, it must be a user defined one. Emit
1390 // a call to it.
1391 Function *F = TheModule-&gt;getFunction(std::string("binary")+Op);
1392 assert(F &amp;&amp; "binary operator not found!");
1393
1394 Value *Ops[] = { L, R };
1395 return Builder.CreateCall(F, Ops, Ops+2, "binop");
1396}
1397
1398Value *CallExprAST::Codegen() {
1399 // Look up the name in the global module table.
1400 Function *CalleeF = TheModule-&gt;getFunction(Callee);
1401 if (CalleeF == 0)
1402 return ErrorV("Unknown function referenced");
1403
1404 // If argument mismatch error.
1405 if (CalleeF-&gt;arg_size() != Args.size())
1406 return ErrorV("Incorrect # arguments passed");
1407
1408 std::vector&lt;Value*&gt; ArgsV;
1409 for (unsigned i = 0, e = Args.size(); i != e; ++i) {
1410 ArgsV.push_back(Args[i]-&gt;Codegen());
1411 if (ArgsV.back() == 0) return 0;
1412 }
1413
1414 return Builder.CreateCall(CalleeF, ArgsV.begin(), ArgsV.end(), "calltmp");
1415}
1416
1417Value *IfExprAST::Codegen() {
1418 Value *CondV = Cond-&gt;Codegen();
1419 if (CondV == 0) return 0;
1420
1421 // Convert condition to a bool by comparing equal to 0.0.
1422 CondV = Builder.CreateFCmpONE(CondV,
1423 ConstantFP::get(Type::DoubleTy, APFloat(0.0)),
1424 "ifcond");
1425
1426 Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();
1427
1428 // Create blocks for the then and else cases. Insert the 'then' block at the
1429 // end of the function.
1430 BasicBlock *ThenBB = new BasicBlock("then", TheFunction);
1431 BasicBlock *ElseBB = new BasicBlock("else");
1432 BasicBlock *MergeBB = new BasicBlock("ifcont");
1433
1434 Builder.CreateCondBr(CondV, ThenBB, ElseBB);
1435
1436 // Emit then value.
1437 Builder.SetInsertPoint(ThenBB);
1438
1439 Value *ThenV = Then-&gt;Codegen();
1440 if (ThenV == 0) return 0;
1441
1442 Builder.CreateBr(MergeBB);
1443 // Codegen of 'Then' can change the current block, update ThenBB for the PHI.
1444 ThenBB = Builder.GetInsertBlock();
1445
1446 // Emit else block.
1447 TheFunction-&gt;getBasicBlockList().push_back(ElseBB);
1448 Builder.SetInsertPoint(ElseBB);
1449
1450 Value *ElseV = Else-&gt;Codegen();
1451 if (ElseV == 0) return 0;
1452
1453 Builder.CreateBr(MergeBB);
1454 // Codegen of 'Else' can change the current block, update ElseBB for the PHI.
1455 ElseBB = Builder.GetInsertBlock();
1456
1457 // Emit merge block.
1458 TheFunction-&gt;getBasicBlockList().push_back(MergeBB);
1459 Builder.SetInsertPoint(MergeBB);
1460 PHINode *PN = Builder.CreatePHI(Type::DoubleTy, "iftmp");
1461
1462 PN-&gt;addIncoming(ThenV, ThenBB);
1463 PN-&gt;addIncoming(ElseV, ElseBB);
1464 return PN;
1465}
1466
1467Value *ForExprAST::Codegen() {
1468 // Output this as:
1469 // ...
1470 // start = startexpr
1471 // goto loop
1472 // loop:
1473 // variable = phi [start, loopheader], [nextvariable, loopend]
1474 // ...
1475 // bodyexpr
1476 // ...
1477 // loopend:
1478 // step = stepexpr
1479 // nextvariable = variable + step
1480 // endcond = endexpr
1481 // br endcond, loop, endloop
1482 // outloop:
1483
1484 // Emit the start code first, without 'variable' in scope.
1485 Value *StartVal = Start-&gt;Codegen();
1486 if (StartVal == 0) return 0;
1487
1488 // Make the new basic block for the loop header, inserting after current
1489 // block.
1490 Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();
1491 BasicBlock *PreheaderBB = Builder.GetInsertBlock();
1492 BasicBlock *LoopBB = new BasicBlock("loop", TheFunction);
1493
1494 // Insert an explicit fall through from the current block to the LoopBB.
1495 Builder.CreateBr(LoopBB);
1496
1497 // Start insertion in LoopBB.
1498 Builder.SetInsertPoint(LoopBB);
1499
1500 // Start the PHI node with an entry for Start.
1501 PHINode *Variable = Builder.CreatePHI(Type::DoubleTy, VarName.c_str());
1502 Variable-&gt;addIncoming(StartVal, PreheaderBB);
1503
1504 // Within the loop, the variable is defined equal to the PHI node. If it
1505 // shadows an existing variable, we have to restore it, so save it now.
1506 Value *OldVal = NamedValues[VarName];
1507 NamedValues[VarName] = Variable;
1508
1509 // Emit the body of the loop. This, like any other expr, can change the
1510 // current BB. Note that we ignore the value computed by the body, but don't
1511 // allow an error.
1512 if (Body-&gt;Codegen() == 0)
1513 return 0;
1514
1515 // Emit the step value.
1516 Value *StepVal;
1517 if (Step) {
1518 StepVal = Step-&gt;Codegen();
1519 if (StepVal == 0) return 0;
1520 } else {
1521 // If not specified, use 1.0.
1522 StepVal = ConstantFP::get(Type::DoubleTy, APFloat(1.0));
1523 }
1524
1525 Value *NextVar = Builder.CreateAdd(Variable, StepVal, "nextvar");
1526
1527 // Compute the end condition.
1528 Value *EndCond = End-&gt;Codegen();
1529 if (EndCond == 0) return EndCond;
1530
1531 // Convert condition to a bool by comparing equal to 0.0.
1532 EndCond = Builder.CreateFCmpONE(EndCond,
1533 ConstantFP::get(Type::DoubleTy, APFloat(0.0)),
1534 "loopcond");
1535
1536 // Create the "after loop" block and insert it.
1537 BasicBlock *LoopEndBB = Builder.GetInsertBlock();
1538 BasicBlock *AfterBB = new BasicBlock("afterloop", TheFunction);
1539
1540 // Insert the conditional branch into the end of LoopEndBB.
1541 Builder.CreateCondBr(EndCond, LoopBB, AfterBB);
1542
1543 // Any new code will be inserted in AfterBB.
1544 Builder.SetInsertPoint(AfterBB);
1545
1546 // Add a new entry to the PHI node for the backedge.
1547 Variable-&gt;addIncoming(NextVar, LoopEndBB);
1548
1549 // Restore the unshadowed variable.
1550 if (OldVal)
1551 NamedValues[VarName] = OldVal;
1552 else
1553 NamedValues.erase(VarName);
1554
1555
1556 // for expr always returns 0.0.
1557 return Constant::getNullValue(Type::DoubleTy);
1558}
1559
1560Function *PrototypeAST::Codegen() {
1561 // Make the function type: double(double,double) etc.
1562 std::vector&lt;const Type*&gt; Doubles(Args.size(), Type::DoubleTy);
1563 FunctionType *FT = FunctionType::get(Type::DoubleTy, Doubles, false);
1564
1565 Function *F = new Function(FT, Function::ExternalLinkage, Name, TheModule);
1566
1567 // If F conflicted, there was already something named 'Name'. If it has a
1568 // body, don't allow redefinition or reextern.
1569 if (F-&gt;getName() != Name) {
1570 // Delete the one we just made and get the existing one.
1571 F-&gt;eraseFromParent();
1572 F = TheModule-&gt;getFunction(Name);
1573
1574 // If F already has a body, reject this.
1575 if (!F-&gt;empty()) {
1576 ErrorF("redefinition of function");
1577 return 0;
1578 }
1579
1580 // If F took a different number of args, reject.
1581 if (F-&gt;arg_size() != Args.size()) {
1582 ErrorF("redefinition of function with different # args");
1583 return 0;
1584 }
1585 }
1586
1587 // Set names for all arguments.
1588 unsigned Idx = 0;
1589 for (Function::arg_iterator AI = F-&gt;arg_begin(); Idx != Args.size();
1590 ++AI, ++Idx) {
1591 AI-&gt;setName(Args[Idx]);
1592
1593 // Add arguments to variable symbol table.
1594 NamedValues[Args[Idx]] = AI;
1595 }
1596
1597 return F;
1598}
1599
1600Function *FunctionAST::Codegen() {
1601 NamedValues.clear();
1602
1603 Function *TheFunction = Proto-&gt;Codegen();
1604 if (TheFunction == 0)
1605 return 0;
1606
1607 // If this is an operator, install it.
1608 if (Proto-&gt;isBinaryOp())
1609 BinopPrecedence[Proto-&gt;getOperatorName()] = Proto-&gt;getBinaryPrecedence();
1610
1611 // Create a new basic block to start insertion into.
1612 BasicBlock *BB = new BasicBlock("entry", TheFunction);
1613 Builder.SetInsertPoint(BB);
1614
1615 if (Value *RetVal = Body-&gt;Codegen()) {
1616 // Finish off the function.
1617 Builder.CreateRet(RetVal);
1618
1619 // Validate the generated code, checking for consistency.
1620 verifyFunction(*TheFunction);
1621
1622 // Optimize the function.
1623 TheFPM-&gt;run(*TheFunction);
1624
1625 return TheFunction;
1626 }
1627
1628 // Error reading body, remove function.
1629 TheFunction-&gt;eraseFromParent();
1630
1631 if (Proto-&gt;isBinaryOp())
1632 BinopPrecedence.erase(Proto-&gt;getOperatorName());
1633 return 0;
1634}
1635
1636//===----------------------------------------------------------------------===//
1637// Top-Level parsing and JIT Driver
1638//===----------------------------------------------------------------------===//
1639
1640static ExecutionEngine *TheExecutionEngine;
1641
1642static void HandleDefinition() {
1643 if (FunctionAST *F = ParseDefinition()) {
1644 if (Function *LF = F-&gt;Codegen()) {
1645 fprintf(stderr, "Read function definition:");
1646 LF-&gt;dump();
1647 }
1648 } else {
1649 // Skip token for error recovery.
1650 getNextToken();
1651 }
1652}
1653
1654static void HandleExtern() {
1655 if (PrototypeAST *P = ParseExtern()) {
1656 if (Function *F = P-&gt;Codegen()) {
1657 fprintf(stderr, "Read extern: ");
1658 F-&gt;dump();
1659 }
1660 } else {
1661 // Skip token for error recovery.
1662 getNextToken();
1663 }
1664}
1665
1666static void HandleTopLevelExpression() {
1667 // Evaluate a top level expression into an anonymous function.
1668 if (FunctionAST *F = ParseTopLevelExpr()) {
1669 if (Function *LF = F-&gt;Codegen()) {
1670 // JIT the function, returning a function pointer.
1671 void *FPtr = TheExecutionEngine-&gt;getPointerToFunction(LF);
1672
1673 // Cast it to the right type (takes no arguments, returns a double) so we
1674 // can call it as a native function.
1675 double (*FP)() = (double (*)())FPtr;
1676 fprintf(stderr, "Evaluated to %f\n", FP());
1677 }
1678 } else {
1679 // Skip token for error recovery.
1680 getNextToken();
1681 }
1682}
1683
1684/// top ::= definition | external | expression | ';'
1685static void MainLoop() {
1686 while (1) {
1687 fprintf(stderr, "ready&gt; ");
1688 switch (CurTok) {
1689 case tok_eof: return;
1690 case ';': getNextToken(); break; // ignore top level semicolons.
1691 case tok_def: HandleDefinition(); break;
1692 case tok_extern: HandleExtern(); break;
1693 default: HandleTopLevelExpression(); break;
1694 }
1695 }
1696}
1697
1698
1699
1700//===----------------------------------------------------------------------===//
1701// "Library" functions that can be "extern'd" from user code.
1702//===----------------------------------------------------------------------===//
1703
1704/// putchard - putchar that takes a double and returns 0.
1705extern "C"
1706double putchard(double X) {
1707 putchar((char)X);
1708 return 0;
1709}
1710
1711/// printd - printf that takes a double prints it as "%f\n", returning 0.
1712extern "C"
1713double printd(double X) {
1714 printf("%f\n", X);
1715 return 0;
1716}
1717
1718//===----------------------------------------------------------------------===//
1719// Main driver code.
1720//===----------------------------------------------------------------------===//
1721
1722int main() {
1723 // Install standard binary operators.
1724 // 1 is lowest precedence.
1725 BinopPrecedence['&lt;'] = 10;
1726 BinopPrecedence['+'] = 20;
1727 BinopPrecedence['-'] = 20;
1728 BinopPrecedence['*'] = 40; // highest.
1729
1730 // Prime the first token.
1731 fprintf(stderr, "ready&gt; ");
1732 getNextToken();
1733
1734 // Make the module, which holds all the code.
1735 TheModule = new Module("my cool jit");
1736
1737 // Create the JIT.
1738 TheExecutionEngine = ExecutionEngine::create(TheModule);
1739
1740 {
1741 ExistingModuleProvider OurModuleProvider(TheModule);
1742 FunctionPassManager OurFPM(&amp;OurModuleProvider);
1743
1744 // Set up the optimizer pipeline. Start with registering info about how the
1745 // target lays out data structures.
1746 OurFPM.add(new TargetData(*TheExecutionEngine-&gt;getTargetData()));
1747 // Do simple "peephole" optimizations and bit-twiddling optzns.
1748 OurFPM.add(createInstructionCombiningPass());
1749 // Reassociate expressions.
1750 OurFPM.add(createReassociatePass());
1751 // Eliminate Common SubExpressions.
1752 OurFPM.add(createGVNPass());
1753 // Simplify the control flow graph (deleting unreachable blocks, etc).
1754 OurFPM.add(createCFGSimplificationPass());
1755 // Set the global so the code gen can use this.
1756 TheFPM = &amp;OurFPM;
1757
1758 // Run the main "interpreter loop" now.
1759 MainLoop();
1760
1761 TheFPM = 0;
1762 } // Free module provider and pass manager.
1763
1764
1765 // Print out all of the generated code.
1766 TheModule->dump();
1767 return 0;
1768}
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +00001769</pre>
1770</div>
1771
1772</div>
1773
1774<!-- *********************************************************************** -->
1775<hr>
1776<address>
1777 <a href="http://jigsaw.w3.org/css-validator/check/referer"><img
1778 src="http://jigsaw.w3.org/css-validator/images/vcss" alt="Valid CSS!"></a>
1779 <a href="http://validator.w3.org/check/referer"><img
1780 src="http://www.w3.org/Icons/valid-html401" alt="Valid HTML 4.01!"></a>
1781
1782 <a href="mailto:sabre@nondot.org">Chris Lattner</a><br>
1783 <a href="http://llvm.org">The LLVM Compiler Infrastructure</a><br>
1784 Last modified: $Date: 2007-10-17 11:05:13 -0700 (Wed, 17 Oct 2007) $
1785</address>
1786</body>
1787</html>