blob: 978ba8e1cd67022bb6717518aaab48c32d3f53aa [file] [log] [blame]
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +00001<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
2 "http://www.w3.org/TR/html4/strict.dtd">
3
4<html>
5<head>
Chris Lattner58f2c872007-11-02 05:42:52 +00006 <title>Kaleidoscope: Extending the Language: User-defined Operators</title>
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +00007 <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
8 <meta name="author" content="Chris Lattner">
9 <link rel="stylesheet" href="../llvm.css" type="text/css">
10</head>
11
12<body>
13
Chris Lattner58f2c872007-11-02 05:42:52 +000014<div class="doc_title">Kaleidoscope: Extending the Language: User-defined Operators</div>
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +000015
Chris Lattner128eb862007-11-05 19:06:59 +000016<ul>
17<li>Chapter 6
18 <ol>
19 <li><a href="#intro">Chapter 6 Introduction</a></li>
20 <li><a href="#idea">User-defined Operators: the Idea</a></li>
21 <li><a href="#binary">User-defined Binary Operators</a></li>
22 <li><a href="#unary">User-defined Unary Operators</a></li>
23 <li><a href="#example">Kicking the Tires</a></li>
24 <li><a href="#code">Full Code Listing</a></li>
25 </ol>
26</li>
27</ul>
28
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +000029<div class="doc_author">
30 <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a></p>
31</div>
32
33<!-- *********************************************************************** -->
Chris Lattner128eb862007-11-05 19:06:59 +000034<div class="doc_section"><a name="intro">Chapter 6 Introduction</a></div>
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +000035<!-- *********************************************************************** -->
36
37<div class="doc_text">
38
Chris Lattner128eb862007-11-05 19:06:59 +000039<p>Welcome to Chapter 6 of the "<a href="index.html">Implementing a language
40with LLVM</a>" tutorial. At this point in our tutorial, we now have a fully
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +000041functional language that is fairly minimal, but also useful. One big problem
42with it though is that it doesn't have many useful operators (like division,
43logical negation, or even any comparisons other than less-than.</p>
44
Chris Lattner58f2c872007-11-02 05:42:52 +000045<p>This chapter of the tutorial takes a wild digression into adding user-defined
46operators to the simple and beautiful Kaleidoscope language, giving us a
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +000047simple and ugly language in some ways, but also a powerful one at the same time.
48One of the great things about creating your own language is that you get to
49decide what is good or bad. In this tutorial we'll assume that it is okay and
50use this as a way to show some interesting parsing techniques.</p>
51
Chris Lattner58f2c872007-11-02 05:42:52 +000052<p>At the end of this tutorial, we'll <a href="#example">run through a nice
53little example</a> that shows an example application that you can build with
54Kaleidoscope and the feature set it now has.</p>
55
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +000056</div>
57
58<!-- *********************************************************************** -->
Chris Lattner58f2c872007-11-02 05:42:52 +000059<div class="doc_section"><a name="idea">User-defined Operators: the Idea</a></div>
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +000060<!-- *********************************************************************** -->
61
62<div class="doc_text">
63
64<p>
Chris Lattner58f2c872007-11-02 05:42:52 +000065The "operator overloading" that we will add to Kaleidoscope is more general than
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +000066languages like C++. In C++, you are only allowed to redefine existing
67operators: you can't programatically change the grammar, introduce new
68operators, change precedence levels, etc. In this chapter, we will add this
69capability to Kaleidoscope, which will allow us to round out the set of
70operators that are supported, culminating in a more interesting example app.</p>
71
Chris Lattner58f2c872007-11-02 05:42:52 +000072<p>The point of going into user-defined operators in a tutorial like this is to
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +000073show the power and flexibility of using a hand-written parser. The parser we
74are using so far is using recursive descent for most parts of the grammar, and
75operator precedence parsing for the expressions. See <a
76href="LangImpl2.html">Chapter 2</a> for details. Without using operator
77precedence parsing, it would be very difficult to allow the programmer to
78introduce new operators into the grammar: the grammar is dynamically extensible
79as the JIT runs.</p>
80
81<p>The two specific features we'll add are programmable unary operators (right
82now, Kaleidoscope has no unary operators at all) as well as binary operators.
83An example of this is:</p>
84
85<div class="doc_code">
86<pre>
87# Logical unary not.
88def unary!(v)
89 if v then
90 0
91 else
92 1;
93
94# Define &gt; with the same precedence as &lt;.
95def binary&gt; 10 (LHS RHS)
96 !(LHS &lt; RHS); # alternatively, could just use "RHS &lt; LHS"
97
98# Binary "logical or", (note that it does not "short circuit")
99def binary| 5 (LHS RHS)
100 if LHS then
101 1
102 else if RHS then
103 1
104 else
105 0;
106
107# Define = with slightly lower precedence than relationals.
108def binary= 9 (LHS RHS)
109 !(LHS &lt; RHS | LHS &gt; RHS);
110</pre>
111</div>
112
113<p>Many languages aspire to being able to implement their standard runtime
114library in the language itself. In Kaleidoscope, we can implement significant
115parts of the language in the library!</p>
116
117<p>We will break down implementation of these features into two parts:
Chris Lattner58f2c872007-11-02 05:42:52 +0000118implementing support for user-defined binary operators and adding unary
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +0000119operators.</p>
120
121</div>
122
123<!-- *********************************************************************** -->
Chris Lattner58f2c872007-11-02 05:42:52 +0000124<div class="doc_section"><a name="binary">User-defined Binary Operators</a></div>
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +0000125<!-- *********************************************************************** -->
126
127<div class="doc_text">
128
Chris Lattner58f2c872007-11-02 05:42:52 +0000129<p>Adding support for user-defined binary operators is pretty simple with our
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +0000130current framework. We'll first add support for the unary/binary keywords:</p>
131
132<div class="doc_code">
133<pre>
134enum Token {
135 ...
136 <b>// operators
137 tok_binary = -11, tok_unary = -12</b>
138};
139...
140static int gettok() {
141...
142 if (IdentifierStr == "for") return tok_for;
143 if (IdentifierStr == "in") return tok_in;
144 <b>if (IdentifierStr == "binary") return tok_binary;
145 if (IdentifierStr == "unary") return tok_unary;</b>
146 return tok_identifier;
147</pre>
148</div>
149
150<p>This just adds lexer support for the unary and binary keywords, like we
151did in <a href="LangImpl5.html#iflexer">previous chapters</a>. One nice thing
152about our current AST is that we represent binary operators fully generally
153with their ASCII code as the opcode. For our extended operators, we'll use the
154same representation, so we don't need any new AST or parser support.</p>
155
156<p>On the other hand, we have to be able to represent the definitions of these
157new operators, in the "def binary| 5" part of the function definition. In the
158grammar so far, the "name" for the function definition is parsed as the
159"prototype" production and into the <tt>PrototypeAST</tt> AST node. To
160represent our new user-defined operators as prototypes, we have to extend
161the <tt>PrototypeAST</tt> AST node like this:</p>
162
163<div class="doc_code">
164<pre>
165/// PrototypeAST - This class represents the "prototype" for a function,
166/// which captures its argument names as well as if it is an operator.
167class PrototypeAST {
168 std::string Name;
169 std::vector&lt;std::string&gt; Args;
170 <b>bool isOperator;
171 unsigned Precedence; // Precedence if a binary op.</b>
172public:
173 PrototypeAST(const std::string &amp;name, const std::vector&lt;std::string&gt; &amp;args,
174 <b>bool isoperator = false, unsigned prec = 0</b>)
175 : Name(name), Args(args), <b>isOperator(isoperator), Precedence(prec)</b> {}
176
177 <b>bool isUnaryOp() const { return isOperator &amp;&amp; Args.size() == 1; }
178 bool isBinaryOp() const { return isOperator &amp;&amp; Args.size() == 2; }
179
180 char getOperatorName() const {
181 assert(isUnaryOp() || isBinaryOp());
182 return Name[Name.size()-1];
183 }
184
185 unsigned getBinaryPrecedence() const { return Precedence; }</b>
186
187 Function *Codegen();
188};
189</pre>
190</div>
191
192<p>Basically, in addition to knowing a name for the prototype, we now keep track
193of whether it was an operator, and if it was, what precedence level the operator
Chris Lattner58f2c872007-11-02 05:42:52 +0000194is at. The precedence is only used for binary operators (as you'll see below,
195it just doesn't apply for unary operators). Now that we have a way to represent
196the prototype for a user-defined operator, we need to parse it:</p>
197
198<div class="doc_code">
199<pre>
200/// prototype
201/// ::= id '(' id* ')'
202<b>/// ::= binary LETTER number? (id, id)</b>
203static PrototypeAST *ParsePrototype() {
204 std::string FnName;
205
206 <b>int Kind = 0; // 0 = identifier, 1 = unary, 2 = binary.
207 unsigned BinaryPrecedence = 30;</b>
208
209 switch (CurTok) {
210 default:
211 return ErrorP("Expected function name in prototype");
212 case tok_identifier:
213 FnName = IdentifierStr;
214 Kind = 0;
215 getNextToken();
216 break;
217 <b>case tok_binary:
218 getNextToken();
219 if (!isascii(CurTok))
220 return ErrorP("Expected binary operator");
221 FnName = "binary";
222 FnName += (char)CurTok;
223 Kind = 2;
224 getNextToken();
225
226 // Read the precedence if present.
227 if (CurTok == tok_number) {
228 if (NumVal &lt; 1 || NumVal &gt; 100)
229 return ErrorP("Invalid precedecnce: must be 1..100");
230 BinaryPrecedence = (unsigned)NumVal;
231 getNextToken();
232 }
233 break;</b>
234 }
235
236 if (CurTok != '(')
237 return ErrorP("Expected '(' in prototype");
238
239 std::vector&lt;std::string&gt; ArgNames;
240 while (getNextToken() == tok_identifier)
241 ArgNames.push_back(IdentifierStr);
242 if (CurTok != ')')
243 return ErrorP("Expected ')' in prototype");
244
245 // success.
246 getNextToken(); // eat ')'.
247
248 <b>// Verify right number of names for operator.
249 if (Kind &amp;&amp; ArgNames.size() != Kind)
250 return ErrorP("Invalid number of operands for operator");
251
252 return new PrototypeAST(FnName, ArgNames, Kind != 0, BinaryPrecedence);</b>
253}
254</pre>
255</div>
256
257<p>This is all fairly straight-forward parsing code, and we have already seen
258a lot of similar code in the past. One interesting piece of this is the part
259that sets up <tt>FnName</tt> for binary operators. This builds names like
260"binary@" for a newly defined "@" operator. This takes advantage of the fact
261that symbol names in the LLVM symbol table are allowed to have any character in
262them, inluding embedded nul characters.</p>
263
264<p>The next interesting piece is codegen support for these binary operators.
265Given our current structure, this is a simple addition of a default case for our
266existing binary operator node:</p>
267
268<div class="doc_code">
269<pre>
270Value *BinaryExprAST::Codegen() {
271 Value *L = LHS-&gt;Codegen();
272 Value *R = RHS-&gt;Codegen();
273 if (L == 0 || R == 0) return 0;
274
275 switch (Op) {
276 case '+': return Builder.CreateAdd(L, R, "addtmp");
277 case '-': return Builder.CreateSub(L, R, "subtmp");
278 case '*': return Builder.CreateMul(L, R, "multmp");
279 case '&lt;':
280 L = Builder.CreateFCmpULT(L, R, "multmp");
281 // Convert bool 0/1 to double 0.0 or 1.0
282 return Builder.CreateUIToFP(L, Type::DoubleTy, "booltmp");
283 <b>default: break;</b>
284 }
285
286 <b>// If it wasn't a builtin binary operator, it must be a user defined one. Emit
287 // a call to it.
288 Function *F = TheModule-&gt;getFunction(std::string("binary")+Op);
289 assert(F &amp;&amp; "binary operator not found!");
290
291 Value *Ops[] = { L, R };
292 return Builder.CreateCall(F, Ops, Ops+2, "binop");</b>
293}
294
295</pre>
296</div>
297
298<p>As you can see above, the new code is actually really simple. It just does
299a lookup for the appropriate operator in the symbol table and generates a
300function call to it. Since user-defined operators are just built as normal
301functions (because the "prototype" boils down into a function with the right
302name) everything falls into place.</p>
303
304<p>The final missing piece is a bit of top level magic, here:</p>
305
306<div class="doc_code">
307<pre>
308Function *FunctionAST::Codegen() {
309 NamedValues.clear();
310
311 Function *TheFunction = Proto->Codegen();
312 if (TheFunction == 0)
313 return 0;
314
315 <b>// If this is an operator, install it.
316 if (Proto-&gt;isBinaryOp())
317 BinopPrecedence[Proto->getOperatorName()] = Proto->getBinaryPrecedence();</b>
318
319 // Create a new basic block to start insertion into.
320 BasicBlock *BB = new BasicBlock("entry", TheFunction);
321 Builder.SetInsertPoint(BB);
322
323 if (Value *RetVal = Body-&gt;Codegen()) {
324 ...
325</pre>
326</div>
327
328<p>Basically, before codegening a function, if it is a user-defined operator, we
329register it in the precedence table. This allows the binary operator parsing
330logic we already have to handle it. Since it is a fully-general operator
331precedence parser, this is all we need to do to "extend the grammar".</p>
332
333<p>With that, we have useful user-defined binary operators. This builds a lot
334on the previous framework we built for other operators. Adding unary operators
335is a bit more challenging, because we don't have any framework for it yet, lets
336see what it takes.</p>
337
338</div>
339
340<!-- *********************************************************************** -->
341<div class="doc_section"><a name="unary">User-defined Unary Operators</a></div>
342<!-- *********************************************************************** -->
343
344<div class="doc_text">
345
346<p>Since we don't currently support unary operators in the Kaleidoscope
347langugage, we'll need to add everything for them. Above, we added simple
348support for the 'unary' keyword to the lexer. In addition to that, we need an
349AST node:</p>
350
351<div class="doc_code">
352<pre>
353/// UnaryExprAST - Expression class for a unary operator.
354class UnaryExprAST : public ExprAST {
355 char Opcode;
356 ExprAST *Operand;
357public:
358 UnaryExprAST(char opcode, ExprAST *operand)
359 : Opcode(opcode), Operand(operand) {}
360 virtual Value *Codegen();
361};
362</pre>
363</div>
364
365<p>This AST node is very simple and obvious by now. It directly mirrors the
366binary operator AST node, except that it only has one child. With this, we
367need to add the parsing logic. Parsing a unary operator is pretty simple: we'll
368add a new function to do it:</p>
369
370<div class="doc_code">
371<pre>
372/// unary
373/// ::= primary
374/// ::= '!' unary
375static ExprAST *ParseUnary() {
376 // If the current token is not an operator, it must be a primary expr.
377 if (!isascii(CurTok) || CurTok == '(' || CurTok == ',')
378 return ParsePrimary();
379
380 // If this is a unary operator, read it.
381 int Opc = CurTok;
382 getNextToken();
383 if (ExprAST *Operand = ParseUnary())
384 return new UnaryExprAST(Opc, Operand);
385 return 0;
386}
387</pre>
388</div>
389
390<p>The grammar we add is pretty straight-forward here. If we see a unary
391operator when parsing a primary operator, we eat the operator as a prefix and
392parse the remaining piece as another unary operator. This allows us to handle
393multiple unary operators (e.g. "!!x"). Note that unary operators can't have
394ambiguous parses like binary operators can, so there is no need for precedence
395information.</p>
396
397<p>The problem with the above is that we need to call ParseUnary from somewhere.
398To do this, we change previous callers of ParsePrimary to call ParseUnary
399instead:</p>
400
401<div class="doc_code">
402<pre>
403/// binoprhs
404/// ::= ('+' unary)*
405static ExprAST *ParseBinOpRHS(int ExprPrec, ExprAST *LHS) {
406 ...
407 <b>// Parse the unary expression after the binary operator.
408 ExprAST *RHS = ParseUnary();
409 if (!RHS) return 0;</b>
410 ...
411}
412/// expression
413/// ::= unary binoprhs
414///
415static ExprAST *ParseExpression() {
416 <b>ExprAST *LHS = ParseUnary();</b>
417 if (!LHS) return 0;
418
419 return ParseBinOpRHS(0, LHS);
420}
421</pre>
422</div>
423
424<p>With these two simple changes, we now parse unary operators and build the
425AST for them. Next up, we need to add parser support for prototypes, to parse
426the unary operator prototype. We extend the binary operator code above
427with:</p>
428
429<div class="doc_code">
430<pre>
431/// prototype
432/// ::= id '(' id* ')'
433/// ::= binary LETTER number? (id, id)
434<b>/// ::= unary LETTER (id)</b>
435static PrototypeAST *ParsePrototype() {
436 std::string FnName;
437
438 int Kind = 0; // 0 = identifier, 1 = unary, 2 = binary.
439 unsigned BinaryPrecedence = 30;
440
441 switch (CurTok) {
442 default:
443 return ErrorP("Expected function name in prototype");
444 case tok_identifier:
445 FnName = IdentifierStr;
446 Kind = 0;
447 getNextToken();
448 break;
449 <b>case tok_unary:
450 getNextToken();
451 if (!isascii(CurTok))
452 return ErrorP("Expected unary operator");
453 FnName = "unary";
454 FnName += (char)CurTok;
455 Kind = 1;
456 getNextToken();
457 break;</b>
458 case tok_binary:
459 ...
460</pre>
461</div>
462
463<p>As with binary operators, we name unary operators with a name that includes
464the operator character. This assists us at code generation time. Speaking of,
465the final piece we need to add is codegen support for unary operators. It looks
466like this:</p>
467
468<div class="doc_code">
469<pre>
470Value *UnaryExprAST::Codegen() {
471 Value *OperandV = Operand->Codegen();
472 if (OperandV == 0) return 0;
473
474 Function *F = TheModule->getFunction(std::string("unary")+Opcode);
475 if (F == 0)
476 return ErrorV("Unknown unary operator");
477
478 return Builder.CreateCall(F, OperandV, "unop");
479}
480</pre>
481</div>
482
483<p>This code is similar to, but simpler than, the code for binary operators. It
484is simpler primarily because it doesn't need to handle any predefined operators.
485</p>
486
487</div>
488
489<!-- *********************************************************************** -->
490<div class="doc_section"><a name="example">Kicking the Tires</a></div>
491<!-- *********************************************************************** -->
492
493<div class="doc_text">
494
495<p>It is somewhat hard to believe, but with a few simple extensions we've
Chris Lattnerb5d81b32007-11-02 05:54:25 +0000496covered in the last chapters, we have grown a real-ish language. With this, we
Chris Lattner58f2c872007-11-02 05:42:52 +0000497can do a lot of interesting things, including I/O, math, and a bunch of other
498things. For example, we can now add a nice sequencing operator (printd is
499defined to print out the specified value and a newline):</p>
500
501<div class="doc_code">
502<pre>
503ready&gt; <b>extern printd(x);</b>
504Read extern: declare double @printd(double)
505ready&gt; <b>def binary : 1 (x y) 0; # Low-precedence operator that ignores operands.</b>
506..
507ready&gt; <b>printd(123) : printd(456) : printd(789);</b>
508123.000000
509456.000000
510789.000000
511Evaluated to 0.000000
512</pre>
513</div>
514
Chris Lattnerb5d81b32007-11-02 05:54:25 +0000515<p>We can also define a bunch of other "primitive" operations, such as:</p>
Chris Lattner58f2c872007-11-02 05:42:52 +0000516
517<div class="doc_code">
518<pre>
519# Logical unary not.
520def unary!(v)
521 if v then
522 0
523 else
524 1;
525
526# Unary negate.
527def unary-(v)
528 0-v;
529
530# Define &gt; with the same precedence as &gt;. We could also easily define
531# &lt;= etc.
532def binary&gt; 10 (LHS RHS)
533 !(LHS &lt; RHS);
534
535# Binary logical or, which does not short circuit.
536def binary| 5 (LHS RHS)
537 if LHS then
538 1
539 else if RHS then
540 1
541 else
542 0;
543
544# Binary logical and, which does not short circuit.
545def binary&amp; 6 (LHS RHS)
546 if !LHS then
547 0
548 else
549 !!RHS;
550
551# Define = with slightly lower precedence than relationals.
552def binary = 9 (LHS RHS)
553 !(LHS &lt; RHS | LHS &gt; RHS);
554
555</pre>
556</div>
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +0000557
558
Chris Lattner58f2c872007-11-02 05:42:52 +0000559<p>Given the previous if/then/else support, we can also define interesting
560functions for I/O. For example, the following prints out a character whose
561"density" reflects the value passed in: the lower the value, the denser the
562character:</p>
563
564<div class="doc_code">
565<pre>
566ready&gt;
567<b>
568extern putchard(char)
569def printdensity(d)
570 if d &gt; 8 then
571 putchard(32) # ' '
572 else if d &gt; 4 then
573 putchard(46) # '.'
574 else if d &gt; 2 then
575 putchard(43) # '+'
576 else
577 putchard(42); # '*'</b>
578...
579ready&gt; <b>printdensity(1): printdensity(2): printdensity(3) :
580 printdensity(4): printdensity(5): printdensity(9): putchard(10);</b>
581*++..
582Evaluated to 0.000000
583</pre>
584</div>
585
586<p>Based on these simple primitive operations, we can start to define more
587interesting things. For example, here's a little function that solves for the
588number of iterations it takes for a function in the complex plane to
589converge:</p>
590
591<div class="doc_code">
592<pre>
593# determine whether the specific location diverges.
594# Solve for z = z^2 + c in the complex plane.
595def mandleconverger(real imag iters creal cimag)
596 if iters &gt; 255 | (real*real + imag*imag &gt; 4) then
597 iters
598 else
599 mandleconverger(real*real - imag*imag + creal,
600 2*real*imag + cimag,
601 iters+1, creal, cimag);
602
603# return the number of iterations required for the iteration to escape
604def mandleconverge(real imag)
605 mandleconverger(real, imag, 0, real, imag);
606</pre>
607</div>
608
Chris Lattnerb5d81b32007-11-02 05:54:25 +0000609<p>This "z = z<sup>2</sup> + c" function is a beautiful little creature that is the basis
Chris Lattner58f2c872007-11-02 05:42:52 +0000610for computation of the <a
611href="http://en.wikipedia.org/wiki/Mandelbrot_set">Mandelbrot Set</a>. Our
612<tt>mandelconverge</tt> function returns the number of iterations that it takes
613for a complex orbit to escape, saturating to 255. This is not a very useful
614function by itself, but if you plot its value over a two-dimensional plane,
615you can see the mandelbrot set. Given that we are limited to using putchard
616here, our amazing graphical output is limited, but we can whip together
617something using the density plotter above:</p>
618
619<div class="doc_code">
620<pre>
621# compute and plot the mandlebrot set with the specified 2 dimentional range
622# info.
623def mandelhelp(xmin xmax xstep ymin ymax ystep)
624 for y = ymin, y &lt; ymax, ystep in (
625 (for x = xmin, x &lt; xmax, xstep in
626 printdensity(mandleconverge(x,y)))
627 : putchard(10)
628 )
629
630# mandel - This is a convenient helper function for ploting the mandelbrot set
631# from the specified position with the specified magnification.
632def mandel(realstart imagstart realmag imagmag)
633 mandelhelp(realstart, realstart+realmag*78, realmag,
634 imagstart, imagstart+imagmag*40, imagmag);
635</pre>
636</div>
637
638<p>Given this, we can try plotting out the mandlebrot set! Lets try it out:</p>
639
640<div class="doc_code">
641<pre>
642ready&gt; <b>mandel(-2.3, -1.3, 0.05, 0.07);</b>
643*******************************+++++++++++*************************************
644*************************+++++++++++++++++++++++*******************************
645**********************+++++++++++++++++++++++++++++****************************
646*******************+++++++++++++++++++++.. ...++++++++*************************
647*****************++++++++++++++++++++++.... ...+++++++++***********************
648***************+++++++++++++++++++++++..... ...+++++++++*********************
649**************+++++++++++++++++++++++.... ....+++++++++********************
650*************++++++++++++++++++++++...... .....++++++++*******************
651************+++++++++++++++++++++....... .......+++++++******************
652***********+++++++++++++++++++.... ... .+++++++*****************
653**********+++++++++++++++++....... .+++++++****************
654*********++++++++++++++........... ...+++++++***************
655********++++++++++++............ ...++++++++**************
656********++++++++++... .......... .++++++++**************
657*******+++++++++..... .+++++++++*************
658*******++++++++...... ..+++++++++*************
659*******++++++....... ..+++++++++*************
660*******+++++...... ..+++++++++*************
661*******.... .... ...+++++++++*************
662*******.... . ...+++++++++*************
663*******+++++...... ...+++++++++*************
664*******++++++....... ..+++++++++*************
665*******++++++++...... .+++++++++*************
666*******+++++++++..... ..+++++++++*************
667********++++++++++... .......... .++++++++**************
668********++++++++++++............ ...++++++++**************
669*********++++++++++++++.......... ...+++++++***************
670**********++++++++++++++++........ .+++++++****************
671**********++++++++++++++++++++.... ... ..+++++++****************
672***********++++++++++++++++++++++....... .......++++++++*****************
673************+++++++++++++++++++++++...... ......++++++++******************
674**************+++++++++++++++++++++++.... ....++++++++********************
675***************+++++++++++++++++++++++..... ...+++++++++*********************
676*****************++++++++++++++++++++++.... ...++++++++***********************
677*******************+++++++++++++++++++++......++++++++*************************
678*********************++++++++++++++++++++++.++++++++***************************
679*************************+++++++++++++++++++++++*******************************
680******************************+++++++++++++************************************
681*******************************************************************************
682*******************************************************************************
683*******************************************************************************
684Evaluated to 0.000000
685ready&gt; <b>mandel(-2, -1, 0.02, 0.04);</b>
686**************************+++++++++++++++++++++++++++++++++++++++++++++++++++++
687***********************++++++++++++++++++++++++++++++++++++++++++++++++++++++++
688*********************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++.
689*******************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++...
690*****************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++.....
691***************++++++++++++++++++++++++++++++++++++++++++++++++++++++++........
692**************++++++++++++++++++++++++++++++++++++++++++++++++++++++...........
693************+++++++++++++++++++++++++++++++++++++++++++++++++++++..............
694***********++++++++++++++++++++++++++++++++++++++++++++++++++........ .
695**********++++++++++++++++++++++++++++++++++++++++++++++.............
696********+++++++++++++++++++++++++++++++++++++++++++..................
697*******+++++++++++++++++++++++++++++++++++++++.......................
698******+++++++++++++++++++++++++++++++++++...........................
699*****++++++++++++++++++++++++++++++++............................
700*****++++++++++++++++++++++++++++...............................
701****++++++++++++++++++++++++++...... .........................
702***++++++++++++++++++++++++......... ...... ...........
703***++++++++++++++++++++++............
704**+++++++++++++++++++++..............
705**+++++++++++++++++++................
706*++++++++++++++++++.................
707*++++++++++++++++............ ...
708*++++++++++++++..............
709*+++....++++................
710*.......... ...........
711*
712*.......... ...........
713*+++....++++................
714*++++++++++++++..............
715*++++++++++++++++............ ...
716*++++++++++++++++++.................
717**+++++++++++++++++++................
718**+++++++++++++++++++++..............
719***++++++++++++++++++++++............
720***++++++++++++++++++++++++......... ...... ...........
721****++++++++++++++++++++++++++...... .........................
722*****++++++++++++++++++++++++++++...............................
723*****++++++++++++++++++++++++++++++++............................
724******+++++++++++++++++++++++++++++++++++...........................
725*******+++++++++++++++++++++++++++++++++++++++.......................
726********+++++++++++++++++++++++++++++++++++++++++++..................
727Evaluated to 0.000000
728ready&gt; <b>mandel(-0.9, -1.4, 0.02, 0.03);</b>
729*******************************************************************************
730*******************************************************************************
731*******************************************************************************
732**********+++++++++++++++++++++************************************************
733*+++++++++++++++++++++++++++++++++++++++***************************************
734+++++++++++++++++++++++++++++++++++++++++++++**********************************
735++++++++++++++++++++++++++++++++++++++++++++++++++*****************************
736++++++++++++++++++++++++++++++++++++++++++++++++++++++*************************
737+++++++++++++++++++++++++++++++++++++++++++++++++++++++++**********************
738+++++++++++++++++++++++++++++++++.........++++++++++++++++++*******************
739+++++++++++++++++++++++++++++++.... ......+++++++++++++++++++****************
740+++++++++++++++++++++++++++++....... ........+++++++++++++++++++**************
741++++++++++++++++++++++++++++........ ........++++++++++++++++++++************
742+++++++++++++++++++++++++++......... .. ...+++++++++++++++++++++**********
743++++++++++++++++++++++++++........... ....++++++++++++++++++++++********
744++++++++++++++++++++++++............. .......++++++++++++++++++++++******
745+++++++++++++++++++++++............. ........+++++++++++++++++++++++****
746++++++++++++++++++++++........... ..........++++++++++++++++++++++***
747++++++++++++++++++++........... .........++++++++++++++++++++++*
748++++++++++++++++++............ ...........++++++++++++++++++++
749++++++++++++++++............... .............++++++++++++++++++
750++++++++++++++................. ...............++++++++++++++++
751++++++++++++.................. .................++++++++++++++
752+++++++++.................. .................+++++++++++++
753++++++........ . ......... ..++++++++++++
754++............ ...... ....++++++++++
755.............. ...++++++++++
756.............. ....+++++++++
757.............. .....++++++++
758............. ......++++++++
759........... .......++++++++
760......... ........+++++++
761......... ........+++++++
762......... ....+++++++
763........ ...+++++++
764....... ...+++++++
765 ....+++++++
766 .....+++++++
767 ....+++++++
768 ....+++++++
769 ....+++++++
770Evaluated to 0.000000
771ready&gt; <b>^D</b>
772</pre>
773</div>
774
775<p>At this point, you may be starting to realize that Kaleidoscope is a real
776and powerful language. It may not be self-similar :), but it can be used to
777plot things that are!</p>
778
779<p>With this, we conclude the "adding user-defined operators" chapter of the
780tutorial. We successfully extended our language with the ability to extend the
781language in the library, and showed how this can be used to build a simple but
782interesting end user application in Kaleidoscope. At this point, Kaleidoscope
783can build a variety of applications that are functional and can call functions
Chris Lattnerb5d81b32007-11-02 05:54:25 +0000784with side-effects, but it can't actually define and mutate a variable itself.
Chris Lattner58f2c872007-11-02 05:42:52 +0000785</p>
786
787<p>Strikingly, lack of this feature is an important limitation for some
788languages, and it is not at all obvious how to <a href="LangImpl7.html">add
789support for mutable variables</a> without having to add an "SSA construction"
790phase to your front-end. In the next chapter, we will describe how you can
791add this without building SSA in your front-end.</p>
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +0000792
793</div>
794
795
796<!-- *********************************************************************** -->
797<div class="doc_section"><a name="code">Full Code Listing</a></div>
798<!-- *********************************************************************** -->
799
800<div class="doc_text">
801
802<p>
803Here is the complete code listing for our running example, enhanced with the
804if/then/else and for expressions.. To build this example, use:
805</p>
806
807<div class="doc_code">
808<pre>
809 # Compile
810 g++ -g toy.cpp `llvm-config --cppflags --ldflags --libs core jit native` -O3 -o toy
811 # Run
812 ./toy
813</pre>
814</div>
815
816<p>Here is the code:</p>
817
818<div class="doc_code">
819<pre>
Chris Lattner58f2c872007-11-02 05:42:52 +0000820#include "llvm/DerivedTypes.h"
821#include "llvm/ExecutionEngine/ExecutionEngine.h"
822#include "llvm/Module.h"
823#include "llvm/ModuleProvider.h"
824#include "llvm/PassManager.h"
825#include "llvm/Analysis/Verifier.h"
826#include "llvm/Target/TargetData.h"
827#include "llvm/Transforms/Scalar.h"
828#include "llvm/Support/LLVMBuilder.h"
829#include &lt;cstdio&gt;
830#include &lt;string&gt;
831#include &lt;map&gt;
832#include &lt;vector&gt;
833using namespace llvm;
834
835//===----------------------------------------------------------------------===//
836// Lexer
837//===----------------------------------------------------------------------===//
838
839// The lexer returns tokens [0-255] if it is an unknown character, otherwise one
840// of these for known things.
841enum Token {
842 tok_eof = -1,
843
844 // commands
845 tok_def = -2, tok_extern = -3,
846
847 // primary
848 tok_identifier = -4, tok_number = -5,
849
850 // control
851 tok_if = -6, tok_then = -7, tok_else = -8,
852 tok_for = -9, tok_in = -10,
853
854 // operators
855 tok_binary = -11, tok_unary = -12
856};
857
858static std::string IdentifierStr; // Filled in if tok_identifier
859static double NumVal; // Filled in if tok_number
860
861/// gettok - Return the next token from standard input.
862static int gettok() {
863 static int LastChar = ' ';
864
865 // Skip any whitespace.
866 while (isspace(LastChar))
867 LastChar = getchar();
868
869 if (isalpha(LastChar)) { // identifier: [a-zA-Z][a-zA-Z0-9]*
870 IdentifierStr = LastChar;
871 while (isalnum((LastChar = getchar())))
872 IdentifierStr += LastChar;
873
874 if (IdentifierStr == "def") return tok_def;
875 if (IdentifierStr == "extern") return tok_extern;
876 if (IdentifierStr == "if") return tok_if;
877 if (IdentifierStr == "then") return tok_then;
878 if (IdentifierStr == "else") return tok_else;
879 if (IdentifierStr == "for") return tok_for;
880 if (IdentifierStr == "in") return tok_in;
881 if (IdentifierStr == "binary") return tok_binary;
882 if (IdentifierStr == "unary") return tok_unary;
883 return tok_identifier;
884 }
885
886 if (isdigit(LastChar) || LastChar == '.') { // Number: [0-9.]+
887 std::string NumStr;
888 do {
889 NumStr += LastChar;
890 LastChar = getchar();
891 } while (isdigit(LastChar) || LastChar == '.');
892
893 NumVal = strtod(NumStr.c_str(), 0);
894 return tok_number;
895 }
896
897 if (LastChar == '#') {
898 // Comment until end of line.
899 do LastChar = getchar();
900 while (LastChar != EOF &amp;&amp; LastChar != '\n' &amp; LastChar != '\r');
901
902 if (LastChar != EOF)
903 return gettok();
904 }
905
906 // Check for end of file. Don't eat the EOF.
907 if (LastChar == EOF)
908 return tok_eof;
909
910 // Otherwise, just return the character as its ascii value.
911 int ThisChar = LastChar;
912 LastChar = getchar();
913 return ThisChar;
914}
915
916//===----------------------------------------------------------------------===//
917// Abstract Syntax Tree (aka Parse Tree)
918//===----------------------------------------------------------------------===//
919
920/// ExprAST - Base class for all expression nodes.
921class ExprAST {
922public:
923 virtual ~ExprAST() {}
924 virtual Value *Codegen() = 0;
925};
926
927/// NumberExprAST - Expression class for numeric literals like "1.0".
928class NumberExprAST : public ExprAST {
929 double Val;
930public:
931 NumberExprAST(double val) : Val(val) {}
932 virtual Value *Codegen();
933};
934
935/// VariableExprAST - Expression class for referencing a variable, like "a".
936class VariableExprAST : public ExprAST {
937 std::string Name;
938public:
939 VariableExprAST(const std::string &amp;name) : Name(name) {}
940 virtual Value *Codegen();
941};
942
943/// UnaryExprAST - Expression class for a unary operator.
944class UnaryExprAST : public ExprAST {
945 char Opcode;
946 ExprAST *Operand;
947public:
948 UnaryExprAST(char opcode, ExprAST *operand)
949 : Opcode(opcode), Operand(operand) {}
950 virtual Value *Codegen();
951};
952
953/// BinaryExprAST - Expression class for a binary operator.
954class BinaryExprAST : public ExprAST {
955 char Op;
956 ExprAST *LHS, *RHS;
957public:
958 BinaryExprAST(char op, ExprAST *lhs, ExprAST *rhs)
959 : Op(op), LHS(lhs), RHS(rhs) {}
960 virtual Value *Codegen();
961};
962
963/// CallExprAST - Expression class for function calls.
964class CallExprAST : public ExprAST {
965 std::string Callee;
966 std::vector&lt;ExprAST*&gt; Args;
967public:
968 CallExprAST(const std::string &amp;callee, std::vector&lt;ExprAST*&gt; &amp;args)
969 : Callee(callee), Args(args) {}
970 virtual Value *Codegen();
971};
972
973/// IfExprAST - Expression class for if/then/else.
974class IfExprAST : public ExprAST {
975 ExprAST *Cond, *Then, *Else;
976public:
977 IfExprAST(ExprAST *cond, ExprAST *then, ExprAST *_else)
978 : Cond(cond), Then(then), Else(_else) {}
979 virtual Value *Codegen();
980};
981
982/// ForExprAST - Expression class for for/in.
983class ForExprAST : public ExprAST {
984 std::string VarName;
985 ExprAST *Start, *End, *Step, *Body;
986public:
987 ForExprAST(const std::string &amp;varname, ExprAST *start, ExprAST *end,
988 ExprAST *step, ExprAST *body)
989 : VarName(varname), Start(start), End(end), Step(step), Body(body) {}
990 virtual Value *Codegen();
991};
992
993/// PrototypeAST - This class represents the "prototype" for a function,
994/// which captures its argument names as well as if it is an operator.
995class PrototypeAST {
996 std::string Name;
997 std::vector&lt;std::string&gt; Args;
998 bool isOperator;
999 unsigned Precedence; // Precedence if a binary op.
1000public:
1001 PrototypeAST(const std::string &amp;name, const std::vector&lt;std::string&gt; &amp;args,
1002 bool isoperator = false, unsigned prec = 0)
1003 : Name(name), Args(args), isOperator(isoperator), Precedence(prec) {}
1004
1005 bool isUnaryOp() const { return isOperator &amp;&amp; Args.size() == 1; }
1006 bool isBinaryOp() const { return isOperator &amp;&amp; Args.size() == 2; }
1007
1008 char getOperatorName() const {
1009 assert(isUnaryOp() || isBinaryOp());
1010 return Name[Name.size()-1];
1011 }
1012
1013 unsigned getBinaryPrecedence() const { return Precedence; }
1014
1015 Function *Codegen();
1016};
1017
1018/// FunctionAST - This class represents a function definition itself.
1019class FunctionAST {
1020 PrototypeAST *Proto;
1021 ExprAST *Body;
1022public:
1023 FunctionAST(PrototypeAST *proto, ExprAST *body)
1024 : Proto(proto), Body(body) {}
1025
1026 Function *Codegen();
1027};
1028
1029//===----------------------------------------------------------------------===//
1030// Parser
1031//===----------------------------------------------------------------------===//
1032
1033/// CurTok/getNextToken - Provide a simple token buffer. CurTok is the current
1034/// token the parser it looking at. getNextToken reads another token from the
1035/// lexer and updates CurTok with its results.
1036static int CurTok;
1037static int getNextToken() {
1038 return CurTok = gettok();
1039}
1040
1041/// BinopPrecedence - This holds the precedence for each binary operator that is
1042/// defined.
1043static std::map&lt;char, int&gt; BinopPrecedence;
1044
1045/// GetTokPrecedence - Get the precedence of the pending binary operator token.
1046static int GetTokPrecedence() {
1047 if (!isascii(CurTok))
1048 return -1;
1049
1050 // Make sure it's a declared binop.
1051 int TokPrec = BinopPrecedence[CurTok];
1052 if (TokPrec &lt;= 0) return -1;
1053 return TokPrec;
1054}
1055
1056/// Error* - These are little helper functions for error handling.
1057ExprAST *Error(const char *Str) { fprintf(stderr, "Error: %s\n", Str);return 0;}
1058PrototypeAST *ErrorP(const char *Str) { Error(Str); return 0; }
1059FunctionAST *ErrorF(const char *Str) { Error(Str); return 0; }
1060
1061static ExprAST *ParseExpression();
1062
1063/// identifierexpr
Chris Lattner20a0c802007-11-05 17:54:34 +00001064/// ::= identifier
1065/// ::= identifier '(' expression* ')'
Chris Lattner58f2c872007-11-02 05:42:52 +00001066static ExprAST *ParseIdentifierExpr() {
1067 std::string IdName = IdentifierStr;
1068
Chris Lattner20a0c802007-11-05 17:54:34 +00001069 getNextToken(); // eat identifier.
Chris Lattner58f2c872007-11-02 05:42:52 +00001070
1071 if (CurTok != '(') // Simple variable ref.
1072 return new VariableExprAST(IdName);
1073
1074 // Call.
1075 getNextToken(); // eat (
1076 std::vector&lt;ExprAST*&gt; Args;
1077 if (CurTok != ')') {
1078 while (1) {
1079 ExprAST *Arg = ParseExpression();
1080 if (!Arg) return 0;
1081 Args.push_back(Arg);
1082
1083 if (CurTok == ')') break;
1084
1085 if (CurTok != ',')
1086 return Error("Expected ')'");
1087 getNextToken();
1088 }
1089 }
1090
1091 // Eat the ')'.
1092 getNextToken();
1093
1094 return new CallExprAST(IdName, Args);
1095}
1096
1097/// numberexpr ::= number
1098static ExprAST *ParseNumberExpr() {
1099 ExprAST *Result = new NumberExprAST(NumVal);
1100 getNextToken(); // consume the number
1101 return Result;
1102}
1103
1104/// parenexpr ::= '(' expression ')'
1105static ExprAST *ParseParenExpr() {
1106 getNextToken(); // eat (.
1107 ExprAST *V = ParseExpression();
1108 if (!V) return 0;
1109
1110 if (CurTok != ')')
1111 return Error("expected ')'");
1112 getNextToken(); // eat ).
1113 return V;
1114}
1115
1116/// ifexpr ::= 'if' expression 'then' expression 'else' expression
1117static ExprAST *ParseIfExpr() {
1118 getNextToken(); // eat the if.
1119
1120 // condition.
1121 ExprAST *Cond = ParseExpression();
1122 if (!Cond) return 0;
1123
1124 if (CurTok != tok_then)
1125 return Error("expected then");
1126 getNextToken(); // eat the then
1127
1128 ExprAST *Then = ParseExpression();
1129 if (Then == 0) return 0;
1130
1131 if (CurTok != tok_else)
1132 return Error("expected else");
1133
1134 getNextToken();
1135
1136 ExprAST *Else = ParseExpression();
1137 if (!Else) return 0;
1138
1139 return new IfExprAST(Cond, Then, Else);
1140}
1141
Chris Lattner20a0c802007-11-05 17:54:34 +00001142/// forexpr ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression
Chris Lattner58f2c872007-11-02 05:42:52 +00001143static ExprAST *ParseForExpr() {
1144 getNextToken(); // eat the for.
1145
1146 if (CurTok != tok_identifier)
1147 return Error("expected identifier after for");
1148
1149 std::string IdName = IdentifierStr;
Chris Lattner20a0c802007-11-05 17:54:34 +00001150 getNextToken(); // eat identifier.
Chris Lattner58f2c872007-11-02 05:42:52 +00001151
1152 if (CurTok != '=')
1153 return Error("expected '=' after for");
1154 getNextToken(); // eat '='.
1155
1156
1157 ExprAST *Start = ParseExpression();
1158 if (Start == 0) return 0;
1159 if (CurTok != ',')
1160 return Error("expected ',' after for start value");
1161 getNextToken();
1162
1163 ExprAST *End = ParseExpression();
1164 if (End == 0) return 0;
1165
1166 // The step value is optional.
1167 ExprAST *Step = 0;
1168 if (CurTok == ',') {
1169 getNextToken();
1170 Step = ParseExpression();
1171 if (Step == 0) return 0;
1172 }
1173
1174 if (CurTok != tok_in)
1175 return Error("expected 'in' after for");
1176 getNextToken(); // eat 'in'.
1177
1178 ExprAST *Body = ParseExpression();
1179 if (Body == 0) return 0;
1180
1181 return new ForExprAST(IdName, Start, End, Step, Body);
1182}
1183
1184
1185/// primary
1186/// ::= identifierexpr
1187/// ::= numberexpr
1188/// ::= parenexpr
1189/// ::= ifexpr
1190/// ::= forexpr
1191static ExprAST *ParsePrimary() {
1192 switch (CurTok) {
1193 default: return Error("unknown token when expecting an expression");
1194 case tok_identifier: return ParseIdentifierExpr();
1195 case tok_number: return ParseNumberExpr();
1196 case '(': return ParseParenExpr();
1197 case tok_if: return ParseIfExpr();
1198 case tok_for: return ParseForExpr();
1199 }
1200}
1201
1202/// unary
1203/// ::= primary
1204/// ::= '!' unary
1205static ExprAST *ParseUnary() {
1206 // If the current token is not an operator, it must be a primary expr.
1207 if (!isascii(CurTok) || CurTok == '(' || CurTok == ',')
1208 return ParsePrimary();
1209
1210 // If this is a unary operator, read it.
1211 int Opc = CurTok;
1212 getNextToken();
1213 if (ExprAST *Operand = ParseUnary())
1214 return new UnaryExprAST(Opc, Operand);
1215 return 0;
1216}
1217
1218/// binoprhs
1219/// ::= ('+' unary)*
1220static ExprAST *ParseBinOpRHS(int ExprPrec, ExprAST *LHS) {
1221 // If this is a binop, find its precedence.
1222 while (1) {
1223 int TokPrec = GetTokPrecedence();
1224
1225 // If this is a binop that binds at least as tightly as the current binop,
1226 // consume it, otherwise we are done.
1227 if (TokPrec &lt; ExprPrec)
1228 return LHS;
1229
1230 // Okay, we know this is a binop.
1231 int BinOp = CurTok;
1232 getNextToken(); // eat binop
1233
1234 // Parse the unary expression after the binary operator.
1235 ExprAST *RHS = ParseUnary();
1236 if (!RHS) return 0;
1237
1238 // If BinOp binds less tightly with RHS than the operator after RHS, let
1239 // the pending operator take RHS as its LHS.
1240 int NextPrec = GetTokPrecedence();
1241 if (TokPrec &lt; NextPrec) {
1242 RHS = ParseBinOpRHS(TokPrec+1, RHS);
1243 if (RHS == 0) return 0;
1244 }
1245
1246 // Merge LHS/RHS.
1247 LHS = new BinaryExprAST(BinOp, LHS, RHS);
1248 }
1249}
1250
1251/// expression
1252/// ::= unary binoprhs
1253///
1254static ExprAST *ParseExpression() {
1255 ExprAST *LHS = ParseUnary();
1256 if (!LHS) return 0;
1257
1258 return ParseBinOpRHS(0, LHS);
1259}
1260
1261/// prototype
1262/// ::= id '(' id* ')'
1263/// ::= binary LETTER number? (id, id)
1264/// ::= unary LETTER (id)
1265static PrototypeAST *ParsePrototype() {
1266 std::string FnName;
1267
1268 int Kind = 0; // 0 = identifier, 1 = unary, 2 = binary.
1269 unsigned BinaryPrecedence = 30;
1270
1271 switch (CurTok) {
1272 default:
1273 return ErrorP("Expected function name in prototype");
1274 case tok_identifier:
1275 FnName = IdentifierStr;
1276 Kind = 0;
1277 getNextToken();
1278 break;
1279 case tok_unary:
1280 getNextToken();
1281 if (!isascii(CurTok))
1282 return ErrorP("Expected unary operator");
1283 FnName = "unary";
1284 FnName += (char)CurTok;
1285 Kind = 1;
1286 getNextToken();
1287 break;
1288 case tok_binary:
1289 getNextToken();
1290 if (!isascii(CurTok))
1291 return ErrorP("Expected binary operator");
1292 FnName = "binary";
1293 FnName += (char)CurTok;
1294 Kind = 2;
1295 getNextToken();
1296
1297 // Read the precedence if present.
1298 if (CurTok == tok_number) {
1299 if (NumVal &lt; 1 || NumVal &gt; 100)
1300 return ErrorP("Invalid precedecnce: must be 1..100");
1301 BinaryPrecedence = (unsigned)NumVal;
1302 getNextToken();
1303 }
1304 break;
1305 }
1306
1307 if (CurTok != '(')
1308 return ErrorP("Expected '(' in prototype");
1309
1310 std::vector&lt;std::string&gt; ArgNames;
1311 while (getNextToken() == tok_identifier)
1312 ArgNames.push_back(IdentifierStr);
1313 if (CurTok != ')')
1314 return ErrorP("Expected ')' in prototype");
1315
1316 // success.
1317 getNextToken(); // eat ')'.
1318
1319 // Verify right number of names for operator.
1320 if (Kind &amp;&amp; ArgNames.size() != Kind)
1321 return ErrorP("Invalid number of operands for operator");
1322
1323 return new PrototypeAST(FnName, ArgNames, Kind != 0, BinaryPrecedence);
1324}
1325
1326/// definition ::= 'def' prototype expression
1327static FunctionAST *ParseDefinition() {
1328 getNextToken(); // eat def.
1329 PrototypeAST *Proto = ParsePrototype();
1330 if (Proto == 0) return 0;
1331
1332 if (ExprAST *E = ParseExpression())
1333 return new FunctionAST(Proto, E);
1334 return 0;
1335}
1336
1337/// toplevelexpr ::= expression
1338static FunctionAST *ParseTopLevelExpr() {
1339 if (ExprAST *E = ParseExpression()) {
1340 // Make an anonymous proto.
1341 PrototypeAST *Proto = new PrototypeAST("", std::vector&lt;std::string&gt;());
1342 return new FunctionAST(Proto, E);
1343 }
1344 return 0;
1345}
1346
1347/// external ::= 'extern' prototype
1348static PrototypeAST *ParseExtern() {
1349 getNextToken(); // eat extern.
1350 return ParsePrototype();
1351}
1352
1353//===----------------------------------------------------------------------===//
1354// Code Generation
1355//===----------------------------------------------------------------------===//
1356
1357static Module *TheModule;
1358static LLVMFoldingBuilder Builder;
1359static std::map&lt;std::string, Value*&gt; NamedValues;
1360static FunctionPassManager *TheFPM;
1361
1362Value *ErrorV(const char *Str) { Error(Str); return 0; }
1363
1364Value *NumberExprAST::Codegen() {
1365 return ConstantFP::get(Type::DoubleTy, APFloat(Val));
1366}
1367
1368Value *VariableExprAST::Codegen() {
1369 // Look this variable up in the function.
1370 Value *V = NamedValues[Name];
1371 return V ? V : ErrorV("Unknown variable name");
1372}
1373
1374Value *UnaryExprAST::Codegen() {
1375 Value *OperandV = Operand-&gt;Codegen();
1376 if (OperandV == 0) return 0;
1377
1378 Function *F = TheModule-&gt;getFunction(std::string("unary")+Opcode);
1379 if (F == 0)
1380 return ErrorV("Unknown unary operator");
1381
1382 return Builder.CreateCall(F, OperandV, "unop");
1383}
1384
1385
1386Value *BinaryExprAST::Codegen() {
1387 Value *L = LHS-&gt;Codegen();
1388 Value *R = RHS-&gt;Codegen();
1389 if (L == 0 || R == 0) return 0;
1390
1391 switch (Op) {
1392 case '+': return Builder.CreateAdd(L, R, "addtmp");
1393 case '-': return Builder.CreateSub(L, R, "subtmp");
1394 case '*': return Builder.CreateMul(L, R, "multmp");
1395 case '&lt;':
1396 L = Builder.CreateFCmpULT(L, R, "multmp");
1397 // Convert bool 0/1 to double 0.0 or 1.0
1398 return Builder.CreateUIToFP(L, Type::DoubleTy, "booltmp");
1399 default: break;
1400 }
1401
1402 // If it wasn't a builtin binary operator, it must be a user defined one. Emit
1403 // a call to it.
1404 Function *F = TheModule-&gt;getFunction(std::string("binary")+Op);
1405 assert(F &amp;&amp; "binary operator not found!");
1406
1407 Value *Ops[] = { L, R };
1408 return Builder.CreateCall(F, Ops, Ops+2, "binop");
1409}
1410
1411Value *CallExprAST::Codegen() {
1412 // Look up the name in the global module table.
1413 Function *CalleeF = TheModule-&gt;getFunction(Callee);
1414 if (CalleeF == 0)
1415 return ErrorV("Unknown function referenced");
1416
1417 // If argument mismatch error.
1418 if (CalleeF-&gt;arg_size() != Args.size())
1419 return ErrorV("Incorrect # arguments passed");
1420
1421 std::vector&lt;Value*&gt; ArgsV;
1422 for (unsigned i = 0, e = Args.size(); i != e; ++i) {
1423 ArgsV.push_back(Args[i]-&gt;Codegen());
1424 if (ArgsV.back() == 0) return 0;
1425 }
1426
1427 return Builder.CreateCall(CalleeF, ArgsV.begin(), ArgsV.end(), "calltmp");
1428}
1429
1430Value *IfExprAST::Codegen() {
1431 Value *CondV = Cond-&gt;Codegen();
1432 if (CondV == 0) return 0;
1433
1434 // Convert condition to a bool by comparing equal to 0.0.
1435 CondV = Builder.CreateFCmpONE(CondV,
1436 ConstantFP::get(Type::DoubleTy, APFloat(0.0)),
1437 "ifcond");
1438
1439 Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();
1440
1441 // Create blocks for the then and else cases. Insert the 'then' block at the
1442 // end of the function.
1443 BasicBlock *ThenBB = new BasicBlock("then", TheFunction);
1444 BasicBlock *ElseBB = new BasicBlock("else");
1445 BasicBlock *MergeBB = new BasicBlock("ifcont");
1446
1447 Builder.CreateCondBr(CondV, ThenBB, ElseBB);
1448
1449 // Emit then value.
1450 Builder.SetInsertPoint(ThenBB);
1451
1452 Value *ThenV = Then-&gt;Codegen();
1453 if (ThenV == 0) return 0;
1454
1455 Builder.CreateBr(MergeBB);
1456 // Codegen of 'Then' can change the current block, update ThenBB for the PHI.
1457 ThenBB = Builder.GetInsertBlock();
1458
1459 // Emit else block.
1460 TheFunction-&gt;getBasicBlockList().push_back(ElseBB);
1461 Builder.SetInsertPoint(ElseBB);
1462
1463 Value *ElseV = Else-&gt;Codegen();
1464 if (ElseV == 0) return 0;
1465
1466 Builder.CreateBr(MergeBB);
1467 // Codegen of 'Else' can change the current block, update ElseBB for the PHI.
1468 ElseBB = Builder.GetInsertBlock();
1469
1470 // Emit merge block.
1471 TheFunction-&gt;getBasicBlockList().push_back(MergeBB);
1472 Builder.SetInsertPoint(MergeBB);
1473 PHINode *PN = Builder.CreatePHI(Type::DoubleTy, "iftmp");
1474
1475 PN-&gt;addIncoming(ThenV, ThenBB);
1476 PN-&gt;addIncoming(ElseV, ElseBB);
1477 return PN;
1478}
1479
1480Value *ForExprAST::Codegen() {
1481 // Output this as:
1482 // ...
1483 // start = startexpr
1484 // goto loop
1485 // loop:
1486 // variable = phi [start, loopheader], [nextvariable, loopend]
1487 // ...
1488 // bodyexpr
1489 // ...
1490 // loopend:
1491 // step = stepexpr
1492 // nextvariable = variable + step
1493 // endcond = endexpr
1494 // br endcond, loop, endloop
1495 // outloop:
1496
1497 // Emit the start code first, without 'variable' in scope.
1498 Value *StartVal = Start-&gt;Codegen();
1499 if (StartVal == 0) return 0;
1500
1501 // Make the new basic block for the loop header, inserting after current
1502 // block.
1503 Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();
1504 BasicBlock *PreheaderBB = Builder.GetInsertBlock();
1505 BasicBlock *LoopBB = new BasicBlock("loop", TheFunction);
1506
1507 // Insert an explicit fall through from the current block to the LoopBB.
1508 Builder.CreateBr(LoopBB);
1509
1510 // Start insertion in LoopBB.
1511 Builder.SetInsertPoint(LoopBB);
1512
1513 // Start the PHI node with an entry for Start.
1514 PHINode *Variable = Builder.CreatePHI(Type::DoubleTy, VarName.c_str());
1515 Variable-&gt;addIncoming(StartVal, PreheaderBB);
1516
1517 // Within the loop, the variable is defined equal to the PHI node. If it
1518 // shadows an existing variable, we have to restore it, so save it now.
1519 Value *OldVal = NamedValues[VarName];
1520 NamedValues[VarName] = Variable;
1521
1522 // Emit the body of the loop. This, like any other expr, can change the
1523 // current BB. Note that we ignore the value computed by the body, but don't
1524 // allow an error.
1525 if (Body-&gt;Codegen() == 0)
1526 return 0;
1527
1528 // Emit the step value.
1529 Value *StepVal;
1530 if (Step) {
1531 StepVal = Step-&gt;Codegen();
1532 if (StepVal == 0) return 0;
1533 } else {
1534 // If not specified, use 1.0.
1535 StepVal = ConstantFP::get(Type::DoubleTy, APFloat(1.0));
1536 }
1537
1538 Value *NextVar = Builder.CreateAdd(Variable, StepVal, "nextvar");
1539
1540 // Compute the end condition.
1541 Value *EndCond = End-&gt;Codegen();
1542 if (EndCond == 0) return EndCond;
1543
1544 // Convert condition to a bool by comparing equal to 0.0.
1545 EndCond = Builder.CreateFCmpONE(EndCond,
1546 ConstantFP::get(Type::DoubleTy, APFloat(0.0)),
1547 "loopcond");
1548
1549 // Create the "after loop" block and insert it.
1550 BasicBlock *LoopEndBB = Builder.GetInsertBlock();
1551 BasicBlock *AfterBB = new BasicBlock("afterloop", TheFunction);
1552
1553 // Insert the conditional branch into the end of LoopEndBB.
1554 Builder.CreateCondBr(EndCond, LoopBB, AfterBB);
1555
1556 // Any new code will be inserted in AfterBB.
1557 Builder.SetInsertPoint(AfterBB);
1558
1559 // Add a new entry to the PHI node for the backedge.
1560 Variable-&gt;addIncoming(NextVar, LoopEndBB);
1561
1562 // Restore the unshadowed variable.
1563 if (OldVal)
1564 NamedValues[VarName] = OldVal;
1565 else
1566 NamedValues.erase(VarName);
1567
1568
1569 // for expr always returns 0.0.
1570 return Constant::getNullValue(Type::DoubleTy);
1571}
1572
1573Function *PrototypeAST::Codegen() {
1574 // Make the function type: double(double,double) etc.
1575 std::vector&lt;const Type*&gt; Doubles(Args.size(), Type::DoubleTy);
1576 FunctionType *FT = FunctionType::get(Type::DoubleTy, Doubles, false);
1577
1578 Function *F = new Function(FT, Function::ExternalLinkage, Name, TheModule);
1579
1580 // If F conflicted, there was already something named 'Name'. If it has a
1581 // body, don't allow redefinition or reextern.
1582 if (F-&gt;getName() != Name) {
1583 // Delete the one we just made and get the existing one.
1584 F-&gt;eraseFromParent();
1585 F = TheModule-&gt;getFunction(Name);
1586
1587 // If F already has a body, reject this.
1588 if (!F-&gt;empty()) {
1589 ErrorF("redefinition of function");
1590 return 0;
1591 }
1592
1593 // If F took a different number of args, reject.
1594 if (F-&gt;arg_size() != Args.size()) {
1595 ErrorF("redefinition of function with different # args");
1596 return 0;
1597 }
1598 }
1599
1600 // Set names for all arguments.
1601 unsigned Idx = 0;
1602 for (Function::arg_iterator AI = F-&gt;arg_begin(); Idx != Args.size();
1603 ++AI, ++Idx) {
1604 AI-&gt;setName(Args[Idx]);
1605
1606 // Add arguments to variable symbol table.
1607 NamedValues[Args[Idx]] = AI;
1608 }
1609
1610 return F;
1611}
1612
1613Function *FunctionAST::Codegen() {
1614 NamedValues.clear();
1615
1616 Function *TheFunction = Proto-&gt;Codegen();
1617 if (TheFunction == 0)
1618 return 0;
1619
1620 // If this is an operator, install it.
1621 if (Proto-&gt;isBinaryOp())
1622 BinopPrecedence[Proto-&gt;getOperatorName()] = Proto-&gt;getBinaryPrecedence();
1623
1624 // Create a new basic block to start insertion into.
1625 BasicBlock *BB = new BasicBlock("entry", TheFunction);
1626 Builder.SetInsertPoint(BB);
1627
1628 if (Value *RetVal = Body-&gt;Codegen()) {
1629 // Finish off the function.
1630 Builder.CreateRet(RetVal);
1631
1632 // Validate the generated code, checking for consistency.
1633 verifyFunction(*TheFunction);
1634
1635 // Optimize the function.
1636 TheFPM-&gt;run(*TheFunction);
1637
1638 return TheFunction;
1639 }
1640
1641 // Error reading body, remove function.
1642 TheFunction-&gt;eraseFromParent();
1643
1644 if (Proto-&gt;isBinaryOp())
1645 BinopPrecedence.erase(Proto-&gt;getOperatorName());
1646 return 0;
1647}
1648
1649//===----------------------------------------------------------------------===//
1650// Top-Level parsing and JIT Driver
1651//===----------------------------------------------------------------------===//
1652
1653static ExecutionEngine *TheExecutionEngine;
1654
1655static void HandleDefinition() {
1656 if (FunctionAST *F = ParseDefinition()) {
1657 if (Function *LF = F-&gt;Codegen()) {
1658 fprintf(stderr, "Read function definition:");
1659 LF-&gt;dump();
1660 }
1661 } else {
1662 // Skip token for error recovery.
1663 getNextToken();
1664 }
1665}
1666
1667static void HandleExtern() {
1668 if (PrototypeAST *P = ParseExtern()) {
1669 if (Function *F = P-&gt;Codegen()) {
1670 fprintf(stderr, "Read extern: ");
1671 F-&gt;dump();
1672 }
1673 } else {
1674 // Skip token for error recovery.
1675 getNextToken();
1676 }
1677}
1678
1679static void HandleTopLevelExpression() {
1680 // Evaluate a top level expression into an anonymous function.
1681 if (FunctionAST *F = ParseTopLevelExpr()) {
1682 if (Function *LF = F-&gt;Codegen()) {
1683 // JIT the function, returning a function pointer.
1684 void *FPtr = TheExecutionEngine-&gt;getPointerToFunction(LF);
1685
1686 // Cast it to the right type (takes no arguments, returns a double) so we
1687 // can call it as a native function.
1688 double (*FP)() = (double (*)())FPtr;
1689 fprintf(stderr, "Evaluated to %f\n", FP());
1690 }
1691 } else {
1692 // Skip token for error recovery.
1693 getNextToken();
1694 }
1695}
1696
1697/// top ::= definition | external | expression | ';'
1698static void MainLoop() {
1699 while (1) {
1700 fprintf(stderr, "ready&gt; ");
1701 switch (CurTok) {
1702 case tok_eof: return;
1703 case ';': getNextToken(); break; // ignore top level semicolons.
1704 case tok_def: HandleDefinition(); break;
1705 case tok_extern: HandleExtern(); break;
1706 default: HandleTopLevelExpression(); break;
1707 }
1708 }
1709}
1710
1711
1712
1713//===----------------------------------------------------------------------===//
1714// "Library" functions that can be "extern'd" from user code.
1715//===----------------------------------------------------------------------===//
1716
1717/// putchard - putchar that takes a double and returns 0.
1718extern "C"
1719double putchard(double X) {
1720 putchar((char)X);
1721 return 0;
1722}
1723
1724/// printd - printf that takes a double prints it as "%f\n", returning 0.
1725extern "C"
1726double printd(double X) {
1727 printf("%f\n", X);
1728 return 0;
1729}
1730
1731//===----------------------------------------------------------------------===//
1732// Main driver code.
1733//===----------------------------------------------------------------------===//
1734
1735int main() {
1736 // Install standard binary operators.
1737 // 1 is lowest precedence.
1738 BinopPrecedence['&lt;'] = 10;
1739 BinopPrecedence['+'] = 20;
1740 BinopPrecedence['-'] = 20;
1741 BinopPrecedence['*'] = 40; // highest.
1742
1743 // Prime the first token.
1744 fprintf(stderr, "ready&gt; ");
1745 getNextToken();
1746
1747 // Make the module, which holds all the code.
1748 TheModule = new Module("my cool jit");
1749
1750 // Create the JIT.
1751 TheExecutionEngine = ExecutionEngine::create(TheModule);
1752
1753 {
1754 ExistingModuleProvider OurModuleProvider(TheModule);
1755 FunctionPassManager OurFPM(&amp;OurModuleProvider);
1756
1757 // Set up the optimizer pipeline. Start with registering info about how the
1758 // target lays out data structures.
1759 OurFPM.add(new TargetData(*TheExecutionEngine-&gt;getTargetData()));
1760 // Do simple "peephole" optimizations and bit-twiddling optzns.
1761 OurFPM.add(createInstructionCombiningPass());
1762 // Reassociate expressions.
1763 OurFPM.add(createReassociatePass());
1764 // Eliminate Common SubExpressions.
1765 OurFPM.add(createGVNPass());
1766 // Simplify the control flow graph (deleting unreachable blocks, etc).
1767 OurFPM.add(createCFGSimplificationPass());
1768 // Set the global so the code gen can use this.
1769 TheFPM = &amp;OurFPM;
1770
1771 // Run the main "interpreter loop" now.
1772 MainLoop();
1773
1774 TheFPM = 0;
1775 } // Free module provider and pass manager.
1776
1777
1778 // Print out all of the generated code.
1779 TheModule->dump();
1780 return 0;
1781}
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +00001782</pre>
1783</div>
1784
1785</div>
1786
1787<!-- *********************************************************************** -->
1788<hr>
1789<address>
1790 <a href="http://jigsaw.w3.org/css-validator/check/referer"><img
1791 src="http://jigsaw.w3.org/css-validator/images/vcss" alt="Valid CSS!"></a>
1792 <a href="http://validator.w3.org/check/referer"><img
1793 src="http://www.w3.org/Icons/valid-html401" alt="Valid HTML 4.01!"></a>
1794
1795 <a href="mailto:sabre@nondot.org">Chris Lattner</a><br>
1796 <a href="http://llvm.org">The LLVM Compiler Infrastructure</a><br>
1797 Last modified: $Date: 2007-10-17 11:05:13 -0700 (Wed, 17 Oct 2007) $
1798</address>
1799</body>
1800</html>