blob: 8976acf50e748c1fe3b0c4d0801c56f1391692d4 [file] [log] [blame]
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +00001<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
2 "http://www.w3.org/TR/html4/strict.dtd">
3
4<html>
5<head>
Chris Lattner58f2c872007-11-02 05:42:52 +00006 <title>Kaleidoscope: Extending the Language: User-defined Operators</title>
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +00007 <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
8 <meta name="author" content="Chris Lattner">
9 <link rel="stylesheet" href="../llvm.css" type="text/css">
10</head>
11
12<body>
13
Chris Lattner58f2c872007-11-02 05:42:52 +000014<div class="doc_title">Kaleidoscope: Extending the Language: User-defined Operators</div>
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +000015
Chris Lattner128eb862007-11-05 19:06:59 +000016<ul>
Chris Lattner0e555b12007-11-05 20:04:56 +000017<li><a href="index.html">Up to Tutorial Index</a></li>
Chris Lattner128eb862007-11-05 19:06:59 +000018<li>Chapter 6
19 <ol>
20 <li><a href="#intro">Chapter 6 Introduction</a></li>
21 <li><a href="#idea">User-defined Operators: the Idea</a></li>
22 <li><a href="#binary">User-defined Binary Operators</a></li>
23 <li><a href="#unary">User-defined Unary Operators</a></li>
24 <li><a href="#example">Kicking the Tires</a></li>
25 <li><a href="#code">Full Code Listing</a></li>
26 </ol>
27</li>
Chris Lattner0e555b12007-11-05 20:04:56 +000028<li><a href="LangImpl7.html">Chapter 7</a>: Extending the Language: Mutable
29Variables / SSA Construction</li>
Chris Lattner128eb862007-11-05 19:06:59 +000030</ul>
31
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +000032<div class="doc_author">
33 <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a></p>
34</div>
35
36<!-- *********************************************************************** -->
Chris Lattner128eb862007-11-05 19:06:59 +000037<div class="doc_section"><a name="intro">Chapter 6 Introduction</a></div>
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +000038<!-- *********************************************************************** -->
39
40<div class="doc_text">
41
Chris Lattner128eb862007-11-05 19:06:59 +000042<p>Welcome to Chapter 6 of the "<a href="index.html">Implementing a language
43with LLVM</a>" tutorial. At this point in our tutorial, we now have a fully
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +000044functional language that is fairly minimal, but also useful. One big problem
45with it though is that it doesn't have many useful operators (like division,
46logical negation, or even any comparisons other than less-than.</p>
47
Chris Lattner58f2c872007-11-02 05:42:52 +000048<p>This chapter of the tutorial takes a wild digression into adding user-defined
49operators to the simple and beautiful Kaleidoscope language, giving us a
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +000050simple and ugly language in some ways, but also a powerful one at the same time.
51One of the great things about creating your own language is that you get to
52decide what is good or bad. In this tutorial we'll assume that it is okay and
53use this as a way to show some interesting parsing techniques.</p>
54
Chris Lattner58f2c872007-11-02 05:42:52 +000055<p>At the end of this tutorial, we'll <a href="#example">run through a nice
56little example</a> that shows an example application that you can build with
57Kaleidoscope and the feature set it now has.</p>
58
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +000059</div>
60
61<!-- *********************************************************************** -->
Chris Lattner58f2c872007-11-02 05:42:52 +000062<div class="doc_section"><a name="idea">User-defined Operators: the Idea</a></div>
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +000063<!-- *********************************************************************** -->
64
65<div class="doc_text">
66
67<p>
Chris Lattner58f2c872007-11-02 05:42:52 +000068The "operator overloading" that we will add to Kaleidoscope is more general than
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +000069languages like C++. In C++, you are only allowed to redefine existing
70operators: you can't programatically change the grammar, introduce new
71operators, change precedence levels, etc. In this chapter, we will add this
72capability to Kaleidoscope, which will allow us to round out the set of
73operators that are supported, culminating in a more interesting example app.</p>
74
Chris Lattner58f2c872007-11-02 05:42:52 +000075<p>The point of going into user-defined operators in a tutorial like this is to
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +000076show the power and flexibility of using a hand-written parser. The parser we
77are using so far is using recursive descent for most parts of the grammar, and
78operator precedence parsing for the expressions. See <a
79href="LangImpl2.html">Chapter 2</a> for details. Without using operator
80precedence parsing, it would be very difficult to allow the programmer to
81introduce new operators into the grammar: the grammar is dynamically extensible
82as the JIT runs.</p>
83
84<p>The two specific features we'll add are programmable unary operators (right
85now, Kaleidoscope has no unary operators at all) as well as binary operators.
86An example of this is:</p>
87
88<div class="doc_code">
89<pre>
90# Logical unary not.
91def unary!(v)
92 if v then
93 0
94 else
95 1;
96
97# Define &gt; with the same precedence as &lt;.
98def binary&gt; 10 (LHS RHS)
99 !(LHS &lt; RHS); # alternatively, could just use "RHS &lt; LHS"
100
101# Binary "logical or", (note that it does not "short circuit")
102def binary| 5 (LHS RHS)
103 if LHS then
104 1
105 else if RHS then
106 1
107 else
108 0;
109
110# Define = with slightly lower precedence than relationals.
111def binary= 9 (LHS RHS)
112 !(LHS &lt; RHS | LHS &gt; RHS);
113</pre>
114</div>
115
116<p>Many languages aspire to being able to implement their standard runtime
117library in the language itself. In Kaleidoscope, we can implement significant
118parts of the language in the library!</p>
119
120<p>We will break down implementation of these features into two parts:
Chris Lattner58f2c872007-11-02 05:42:52 +0000121implementing support for user-defined binary operators and adding unary
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +0000122operators.</p>
123
124</div>
125
126<!-- *********************************************************************** -->
Chris Lattner58f2c872007-11-02 05:42:52 +0000127<div class="doc_section"><a name="binary">User-defined Binary Operators</a></div>
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +0000128<!-- *********************************************************************** -->
129
130<div class="doc_text">
131
Chris Lattner58f2c872007-11-02 05:42:52 +0000132<p>Adding support for user-defined binary operators is pretty simple with our
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +0000133current framework. We'll first add support for the unary/binary keywords:</p>
134
135<div class="doc_code">
136<pre>
137enum Token {
138 ...
139 <b>// operators
140 tok_binary = -11, tok_unary = -12</b>
141};
142...
143static int gettok() {
144...
145 if (IdentifierStr == "for") return tok_for;
146 if (IdentifierStr == "in") return tok_in;
147 <b>if (IdentifierStr == "binary") return tok_binary;
148 if (IdentifierStr == "unary") return tok_unary;</b>
149 return tok_identifier;
150</pre>
151</div>
152
153<p>This just adds lexer support for the unary and binary keywords, like we
154did in <a href="LangImpl5.html#iflexer">previous chapters</a>. One nice thing
155about our current AST is that we represent binary operators fully generally
156with their ASCII code as the opcode. For our extended operators, we'll use the
157same representation, so we don't need any new AST or parser support.</p>
158
159<p>On the other hand, we have to be able to represent the definitions of these
160new operators, in the "def binary| 5" part of the function definition. In the
161grammar so far, the "name" for the function definition is parsed as the
162"prototype" production and into the <tt>PrototypeAST</tt> AST node. To
163represent our new user-defined operators as prototypes, we have to extend
164the <tt>PrototypeAST</tt> AST node like this:</p>
165
166<div class="doc_code">
167<pre>
168/// PrototypeAST - This class represents the "prototype" for a function,
169/// which captures its argument names as well as if it is an operator.
170class PrototypeAST {
171 std::string Name;
172 std::vector&lt;std::string&gt; Args;
173 <b>bool isOperator;
174 unsigned Precedence; // Precedence if a binary op.</b>
175public:
176 PrototypeAST(const std::string &amp;name, const std::vector&lt;std::string&gt; &amp;args,
177 <b>bool isoperator = false, unsigned prec = 0</b>)
178 : Name(name), Args(args), <b>isOperator(isoperator), Precedence(prec)</b> {}
179
180 <b>bool isUnaryOp() const { return isOperator &amp;&amp; Args.size() == 1; }
181 bool isBinaryOp() const { return isOperator &amp;&amp; Args.size() == 2; }
182
183 char getOperatorName() const {
184 assert(isUnaryOp() || isBinaryOp());
185 return Name[Name.size()-1];
186 }
187
188 unsigned getBinaryPrecedence() const { return Precedence; }</b>
189
190 Function *Codegen();
191};
192</pre>
193</div>
194
195<p>Basically, in addition to knowing a name for the prototype, we now keep track
196of whether it was an operator, and if it was, what precedence level the operator
Chris Lattner58f2c872007-11-02 05:42:52 +0000197is at. The precedence is only used for binary operators (as you'll see below,
198it just doesn't apply for unary operators). Now that we have a way to represent
199the prototype for a user-defined operator, we need to parse it:</p>
200
201<div class="doc_code">
202<pre>
203/// prototype
204/// ::= id '(' id* ')'
205<b>/// ::= binary LETTER number? (id, id)</b>
206static PrototypeAST *ParsePrototype() {
207 std::string FnName;
208
209 <b>int Kind = 0; // 0 = identifier, 1 = unary, 2 = binary.
210 unsigned BinaryPrecedence = 30;</b>
211
212 switch (CurTok) {
213 default:
214 return ErrorP("Expected function name in prototype");
215 case tok_identifier:
216 FnName = IdentifierStr;
217 Kind = 0;
218 getNextToken();
219 break;
220 <b>case tok_binary:
221 getNextToken();
222 if (!isascii(CurTok))
223 return ErrorP("Expected binary operator");
224 FnName = "binary";
225 FnName += (char)CurTok;
226 Kind = 2;
227 getNextToken();
228
229 // Read the precedence if present.
230 if (CurTok == tok_number) {
231 if (NumVal &lt; 1 || NumVal &gt; 100)
232 return ErrorP("Invalid precedecnce: must be 1..100");
233 BinaryPrecedence = (unsigned)NumVal;
234 getNextToken();
235 }
236 break;</b>
237 }
238
239 if (CurTok != '(')
240 return ErrorP("Expected '(' in prototype");
241
242 std::vector&lt;std::string&gt; ArgNames;
243 while (getNextToken() == tok_identifier)
244 ArgNames.push_back(IdentifierStr);
245 if (CurTok != ')')
246 return ErrorP("Expected ')' in prototype");
247
248 // success.
249 getNextToken(); // eat ')'.
250
251 <b>// Verify right number of names for operator.
252 if (Kind &amp;&amp; ArgNames.size() != Kind)
253 return ErrorP("Invalid number of operands for operator");
254
255 return new PrototypeAST(FnName, ArgNames, Kind != 0, BinaryPrecedence);</b>
256}
257</pre>
258</div>
259
260<p>This is all fairly straight-forward parsing code, and we have already seen
261a lot of similar code in the past. One interesting piece of this is the part
262that sets up <tt>FnName</tt> for binary operators. This builds names like
263"binary@" for a newly defined "@" operator. This takes advantage of the fact
264that symbol names in the LLVM symbol table are allowed to have any character in
265them, inluding embedded nul characters.</p>
266
267<p>The next interesting piece is codegen support for these binary operators.
268Given our current structure, this is a simple addition of a default case for our
269existing binary operator node:</p>
270
271<div class="doc_code">
272<pre>
273Value *BinaryExprAST::Codegen() {
274 Value *L = LHS-&gt;Codegen();
275 Value *R = RHS-&gt;Codegen();
276 if (L == 0 || R == 0) return 0;
277
278 switch (Op) {
279 case '+': return Builder.CreateAdd(L, R, "addtmp");
280 case '-': return Builder.CreateSub(L, R, "subtmp");
281 case '*': return Builder.CreateMul(L, R, "multmp");
282 case '&lt;':
Chris Lattner71155212007-11-06 01:39:12 +0000283 L = Builder.CreateFCmpULT(L, R, "cmptmp");
Chris Lattner58f2c872007-11-02 05:42:52 +0000284 // Convert bool 0/1 to double 0.0 or 1.0
285 return Builder.CreateUIToFP(L, Type::DoubleTy, "booltmp");
286 <b>default: break;</b>
287 }
288
289 <b>// If it wasn't a builtin binary operator, it must be a user defined one. Emit
290 // a call to it.
291 Function *F = TheModule-&gt;getFunction(std::string("binary")+Op);
292 assert(F &amp;&amp; "binary operator not found!");
293
294 Value *Ops[] = { L, R };
295 return Builder.CreateCall(F, Ops, Ops+2, "binop");</b>
296}
297
298</pre>
299</div>
300
301<p>As you can see above, the new code is actually really simple. It just does
302a lookup for the appropriate operator in the symbol table and generates a
303function call to it. Since user-defined operators are just built as normal
304functions (because the "prototype" boils down into a function with the right
305name) everything falls into place.</p>
306
307<p>The final missing piece is a bit of top level magic, here:</p>
308
309<div class="doc_code">
310<pre>
311Function *FunctionAST::Codegen() {
312 NamedValues.clear();
313
314 Function *TheFunction = Proto->Codegen();
315 if (TheFunction == 0)
316 return 0;
317
318 <b>// If this is an operator, install it.
319 if (Proto-&gt;isBinaryOp())
320 BinopPrecedence[Proto->getOperatorName()] = Proto->getBinaryPrecedence();</b>
321
322 // Create a new basic block to start insertion into.
323 BasicBlock *BB = new BasicBlock("entry", TheFunction);
324 Builder.SetInsertPoint(BB);
325
326 if (Value *RetVal = Body-&gt;Codegen()) {
327 ...
328</pre>
329</div>
330
331<p>Basically, before codegening a function, if it is a user-defined operator, we
332register it in the precedence table. This allows the binary operator parsing
333logic we already have to handle it. Since it is a fully-general operator
334precedence parser, this is all we need to do to "extend the grammar".</p>
335
336<p>With that, we have useful user-defined binary operators. This builds a lot
337on the previous framework we built for other operators. Adding unary operators
338is a bit more challenging, because we don't have any framework for it yet, lets
339see what it takes.</p>
340
341</div>
342
343<!-- *********************************************************************** -->
344<div class="doc_section"><a name="unary">User-defined Unary Operators</a></div>
345<!-- *********************************************************************** -->
346
347<div class="doc_text">
348
349<p>Since we don't currently support unary operators in the Kaleidoscope
350langugage, we'll need to add everything for them. Above, we added simple
351support for the 'unary' keyword to the lexer. In addition to that, we need an
352AST node:</p>
353
354<div class="doc_code">
355<pre>
356/// UnaryExprAST - Expression class for a unary operator.
357class UnaryExprAST : public ExprAST {
358 char Opcode;
359 ExprAST *Operand;
360public:
361 UnaryExprAST(char opcode, ExprAST *operand)
362 : Opcode(opcode), Operand(operand) {}
363 virtual Value *Codegen();
364};
365</pre>
366</div>
367
368<p>This AST node is very simple and obvious by now. It directly mirrors the
369binary operator AST node, except that it only has one child. With this, we
370need to add the parsing logic. Parsing a unary operator is pretty simple: we'll
371add a new function to do it:</p>
372
373<div class="doc_code">
374<pre>
375/// unary
376/// ::= primary
377/// ::= '!' unary
378static ExprAST *ParseUnary() {
379 // If the current token is not an operator, it must be a primary expr.
380 if (!isascii(CurTok) || CurTok == '(' || CurTok == ',')
381 return ParsePrimary();
382
383 // If this is a unary operator, read it.
384 int Opc = CurTok;
385 getNextToken();
386 if (ExprAST *Operand = ParseUnary())
387 return new UnaryExprAST(Opc, Operand);
388 return 0;
389}
390</pre>
391</div>
392
393<p>The grammar we add is pretty straight-forward here. If we see a unary
394operator when parsing a primary operator, we eat the operator as a prefix and
395parse the remaining piece as another unary operator. This allows us to handle
396multiple unary operators (e.g. "!!x"). Note that unary operators can't have
397ambiguous parses like binary operators can, so there is no need for precedence
398information.</p>
399
400<p>The problem with the above is that we need to call ParseUnary from somewhere.
401To do this, we change previous callers of ParsePrimary to call ParseUnary
402instead:</p>
403
404<div class="doc_code">
405<pre>
406/// binoprhs
407/// ::= ('+' unary)*
408static ExprAST *ParseBinOpRHS(int ExprPrec, ExprAST *LHS) {
409 ...
410 <b>// Parse the unary expression after the binary operator.
411 ExprAST *RHS = ParseUnary();
412 if (!RHS) return 0;</b>
413 ...
414}
415/// expression
416/// ::= unary binoprhs
417///
418static ExprAST *ParseExpression() {
419 <b>ExprAST *LHS = ParseUnary();</b>
420 if (!LHS) return 0;
421
422 return ParseBinOpRHS(0, LHS);
423}
424</pre>
425</div>
426
427<p>With these two simple changes, we now parse unary operators and build the
428AST for them. Next up, we need to add parser support for prototypes, to parse
429the unary operator prototype. We extend the binary operator code above
430with:</p>
431
432<div class="doc_code">
433<pre>
434/// prototype
435/// ::= id '(' id* ')'
436/// ::= binary LETTER number? (id, id)
437<b>/// ::= unary LETTER (id)</b>
438static PrototypeAST *ParsePrototype() {
439 std::string FnName;
440
441 int Kind = 0; // 0 = identifier, 1 = unary, 2 = binary.
442 unsigned BinaryPrecedence = 30;
443
444 switch (CurTok) {
445 default:
446 return ErrorP("Expected function name in prototype");
447 case tok_identifier:
448 FnName = IdentifierStr;
449 Kind = 0;
450 getNextToken();
451 break;
452 <b>case tok_unary:
453 getNextToken();
454 if (!isascii(CurTok))
455 return ErrorP("Expected unary operator");
456 FnName = "unary";
457 FnName += (char)CurTok;
458 Kind = 1;
459 getNextToken();
460 break;</b>
461 case tok_binary:
462 ...
463</pre>
464</div>
465
466<p>As with binary operators, we name unary operators with a name that includes
467the operator character. This assists us at code generation time. Speaking of,
468the final piece we need to add is codegen support for unary operators. It looks
469like this:</p>
470
471<div class="doc_code">
472<pre>
473Value *UnaryExprAST::Codegen() {
474 Value *OperandV = Operand->Codegen();
475 if (OperandV == 0) return 0;
476
477 Function *F = TheModule->getFunction(std::string("unary")+Opcode);
478 if (F == 0)
479 return ErrorV("Unknown unary operator");
480
481 return Builder.CreateCall(F, OperandV, "unop");
482}
483</pre>
484</div>
485
486<p>This code is similar to, but simpler than, the code for binary operators. It
487is simpler primarily because it doesn't need to handle any predefined operators.
488</p>
489
490</div>
491
492<!-- *********************************************************************** -->
493<div class="doc_section"><a name="example">Kicking the Tires</a></div>
494<!-- *********************************************************************** -->
495
496<div class="doc_text">
497
498<p>It is somewhat hard to believe, but with a few simple extensions we've
Chris Lattnerb5d81b32007-11-02 05:54:25 +0000499covered in the last chapters, we have grown a real-ish language. With this, we
Chris Lattner58f2c872007-11-02 05:42:52 +0000500can do a lot of interesting things, including I/O, math, and a bunch of other
501things. For example, we can now add a nice sequencing operator (printd is
502defined to print out the specified value and a newline):</p>
503
504<div class="doc_code">
505<pre>
506ready&gt; <b>extern printd(x);</b>
507Read extern: declare double @printd(double)
508ready&gt; <b>def binary : 1 (x y) 0; # Low-precedence operator that ignores operands.</b>
509..
510ready&gt; <b>printd(123) : printd(456) : printd(789);</b>
511123.000000
512456.000000
513789.000000
514Evaluated to 0.000000
515</pre>
516</div>
517
Chris Lattnerb5d81b32007-11-02 05:54:25 +0000518<p>We can also define a bunch of other "primitive" operations, such as:</p>
Chris Lattner58f2c872007-11-02 05:42:52 +0000519
520<div class="doc_code">
521<pre>
522# Logical unary not.
523def unary!(v)
524 if v then
525 0
526 else
527 1;
528
529# Unary negate.
530def unary-(v)
531 0-v;
532
533# Define &gt; with the same precedence as &gt;. We could also easily define
534# &lt;= etc.
535def binary&gt; 10 (LHS RHS)
536 !(LHS &lt; RHS);
537
538# Binary logical or, which does not short circuit.
539def binary| 5 (LHS RHS)
540 if LHS then
541 1
542 else if RHS then
543 1
544 else
545 0;
546
547# Binary logical and, which does not short circuit.
548def binary&amp; 6 (LHS RHS)
549 if !LHS then
550 0
551 else
552 !!RHS;
553
554# Define = with slightly lower precedence than relationals.
555def binary = 9 (LHS RHS)
556 !(LHS &lt; RHS | LHS &gt; RHS);
557
558</pre>
559</div>
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +0000560
561
Chris Lattner58f2c872007-11-02 05:42:52 +0000562<p>Given the previous if/then/else support, we can also define interesting
563functions for I/O. For example, the following prints out a character whose
564"density" reflects the value passed in: the lower the value, the denser the
565character:</p>
566
567<div class="doc_code">
568<pre>
569ready&gt;
570<b>
571extern putchard(char)
572def printdensity(d)
573 if d &gt; 8 then
574 putchard(32) # ' '
575 else if d &gt; 4 then
576 putchard(46) # '.'
577 else if d &gt; 2 then
578 putchard(43) # '+'
579 else
580 putchard(42); # '*'</b>
581...
582ready&gt; <b>printdensity(1): printdensity(2): printdensity(3) :
583 printdensity(4): printdensity(5): printdensity(9): putchard(10);</b>
584*++..
585Evaluated to 0.000000
586</pre>
587</div>
588
589<p>Based on these simple primitive operations, we can start to define more
590interesting things. For example, here's a little function that solves for the
591number of iterations it takes for a function in the complex plane to
592converge:</p>
593
594<div class="doc_code">
595<pre>
596# determine whether the specific location diverges.
597# Solve for z = z^2 + c in the complex plane.
598def mandleconverger(real imag iters creal cimag)
599 if iters &gt; 255 | (real*real + imag*imag &gt; 4) then
600 iters
601 else
602 mandleconverger(real*real - imag*imag + creal,
603 2*real*imag + cimag,
604 iters+1, creal, cimag);
605
606# return the number of iterations required for the iteration to escape
607def mandleconverge(real imag)
608 mandleconverger(real, imag, 0, real, imag);
609</pre>
610</div>
611
Chris Lattnerb5d81b32007-11-02 05:54:25 +0000612<p>This "z = z<sup>2</sup> + c" function is a beautiful little creature that is the basis
Chris Lattner58f2c872007-11-02 05:42:52 +0000613for computation of the <a
614href="http://en.wikipedia.org/wiki/Mandelbrot_set">Mandelbrot Set</a>. Our
615<tt>mandelconverge</tt> function returns the number of iterations that it takes
616for a complex orbit to escape, saturating to 255. This is not a very useful
617function by itself, but if you plot its value over a two-dimensional plane,
618you can see the mandelbrot set. Given that we are limited to using putchard
619here, our amazing graphical output is limited, but we can whip together
620something using the density plotter above:</p>
621
622<div class="doc_code">
623<pre>
624# compute and plot the mandlebrot set with the specified 2 dimentional range
625# info.
626def mandelhelp(xmin xmax xstep ymin ymax ystep)
627 for y = ymin, y &lt; ymax, ystep in (
628 (for x = xmin, x &lt; xmax, xstep in
629 printdensity(mandleconverge(x,y)))
630 : putchard(10)
631 )
632
633# mandel - This is a convenient helper function for ploting the mandelbrot set
634# from the specified position with the specified magnification.
635def mandel(realstart imagstart realmag imagmag)
636 mandelhelp(realstart, realstart+realmag*78, realmag,
637 imagstart, imagstart+imagmag*40, imagmag);
638</pre>
639</div>
640
641<p>Given this, we can try plotting out the mandlebrot set! Lets try it out:</p>
642
643<div class="doc_code">
644<pre>
645ready&gt; <b>mandel(-2.3, -1.3, 0.05, 0.07);</b>
646*******************************+++++++++++*************************************
647*************************+++++++++++++++++++++++*******************************
648**********************+++++++++++++++++++++++++++++****************************
649*******************+++++++++++++++++++++.. ...++++++++*************************
650*****************++++++++++++++++++++++.... ...+++++++++***********************
651***************+++++++++++++++++++++++..... ...+++++++++*********************
652**************+++++++++++++++++++++++.... ....+++++++++********************
653*************++++++++++++++++++++++...... .....++++++++*******************
654************+++++++++++++++++++++....... .......+++++++******************
655***********+++++++++++++++++++.... ... .+++++++*****************
656**********+++++++++++++++++....... .+++++++****************
657*********++++++++++++++........... ...+++++++***************
658********++++++++++++............ ...++++++++**************
659********++++++++++... .......... .++++++++**************
660*******+++++++++..... .+++++++++*************
661*******++++++++...... ..+++++++++*************
662*******++++++....... ..+++++++++*************
663*******+++++...... ..+++++++++*************
664*******.... .... ...+++++++++*************
665*******.... . ...+++++++++*************
666*******+++++...... ...+++++++++*************
667*******++++++....... ..+++++++++*************
668*******++++++++...... .+++++++++*************
669*******+++++++++..... ..+++++++++*************
670********++++++++++... .......... .++++++++**************
671********++++++++++++............ ...++++++++**************
672*********++++++++++++++.......... ...+++++++***************
673**********++++++++++++++++........ .+++++++****************
674**********++++++++++++++++++++.... ... ..+++++++****************
675***********++++++++++++++++++++++....... .......++++++++*****************
676************+++++++++++++++++++++++...... ......++++++++******************
677**************+++++++++++++++++++++++.... ....++++++++********************
678***************+++++++++++++++++++++++..... ...+++++++++*********************
679*****************++++++++++++++++++++++.... ...++++++++***********************
680*******************+++++++++++++++++++++......++++++++*************************
681*********************++++++++++++++++++++++.++++++++***************************
682*************************+++++++++++++++++++++++*******************************
683******************************+++++++++++++************************************
684*******************************************************************************
685*******************************************************************************
686*******************************************************************************
687Evaluated to 0.000000
688ready&gt; <b>mandel(-2, -1, 0.02, 0.04);</b>
689**************************+++++++++++++++++++++++++++++++++++++++++++++++++++++
690***********************++++++++++++++++++++++++++++++++++++++++++++++++++++++++
691*********************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++.
692*******************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++...
693*****************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++.....
694***************++++++++++++++++++++++++++++++++++++++++++++++++++++++++........
695**************++++++++++++++++++++++++++++++++++++++++++++++++++++++...........
696************+++++++++++++++++++++++++++++++++++++++++++++++++++++..............
697***********++++++++++++++++++++++++++++++++++++++++++++++++++........ .
698**********++++++++++++++++++++++++++++++++++++++++++++++.............
699********+++++++++++++++++++++++++++++++++++++++++++..................
700*******+++++++++++++++++++++++++++++++++++++++.......................
701******+++++++++++++++++++++++++++++++++++...........................
702*****++++++++++++++++++++++++++++++++............................
703*****++++++++++++++++++++++++++++...............................
704****++++++++++++++++++++++++++...... .........................
705***++++++++++++++++++++++++......... ...... ...........
706***++++++++++++++++++++++............
707**+++++++++++++++++++++..............
708**+++++++++++++++++++................
709*++++++++++++++++++.................
710*++++++++++++++++............ ...
711*++++++++++++++..............
712*+++....++++................
713*.......... ...........
714*
715*.......... ...........
716*+++....++++................
717*++++++++++++++..............
718*++++++++++++++++............ ...
719*++++++++++++++++++.................
720**+++++++++++++++++++................
721**+++++++++++++++++++++..............
722***++++++++++++++++++++++............
723***++++++++++++++++++++++++......... ...... ...........
724****++++++++++++++++++++++++++...... .........................
725*****++++++++++++++++++++++++++++...............................
726*****++++++++++++++++++++++++++++++++............................
727******+++++++++++++++++++++++++++++++++++...........................
728*******+++++++++++++++++++++++++++++++++++++++.......................
729********+++++++++++++++++++++++++++++++++++++++++++..................
730Evaluated to 0.000000
731ready&gt; <b>mandel(-0.9, -1.4, 0.02, 0.03);</b>
732*******************************************************************************
733*******************************************************************************
734*******************************************************************************
735**********+++++++++++++++++++++************************************************
736*+++++++++++++++++++++++++++++++++++++++***************************************
737+++++++++++++++++++++++++++++++++++++++++++++**********************************
738++++++++++++++++++++++++++++++++++++++++++++++++++*****************************
739++++++++++++++++++++++++++++++++++++++++++++++++++++++*************************
740+++++++++++++++++++++++++++++++++++++++++++++++++++++++++**********************
741+++++++++++++++++++++++++++++++++.........++++++++++++++++++*******************
742+++++++++++++++++++++++++++++++.... ......+++++++++++++++++++****************
743+++++++++++++++++++++++++++++....... ........+++++++++++++++++++**************
744++++++++++++++++++++++++++++........ ........++++++++++++++++++++************
745+++++++++++++++++++++++++++......... .. ...+++++++++++++++++++++**********
746++++++++++++++++++++++++++........... ....++++++++++++++++++++++********
747++++++++++++++++++++++++............. .......++++++++++++++++++++++******
748+++++++++++++++++++++++............. ........+++++++++++++++++++++++****
749++++++++++++++++++++++........... ..........++++++++++++++++++++++***
750++++++++++++++++++++........... .........++++++++++++++++++++++*
751++++++++++++++++++............ ...........++++++++++++++++++++
752++++++++++++++++............... .............++++++++++++++++++
753++++++++++++++................. ...............++++++++++++++++
754++++++++++++.................. .................++++++++++++++
755+++++++++.................. .................+++++++++++++
756++++++........ . ......... ..++++++++++++
757++............ ...... ....++++++++++
758.............. ...++++++++++
759.............. ....+++++++++
760.............. .....++++++++
761............. ......++++++++
762........... .......++++++++
763......... ........+++++++
764......... ........+++++++
765......... ....+++++++
766........ ...+++++++
767....... ...+++++++
768 ....+++++++
769 .....+++++++
770 ....+++++++
771 ....+++++++
772 ....+++++++
773Evaluated to 0.000000
774ready&gt; <b>^D</b>
775</pre>
776</div>
777
778<p>At this point, you may be starting to realize that Kaleidoscope is a real
779and powerful language. It may not be self-similar :), but it can be used to
780plot things that are!</p>
781
782<p>With this, we conclude the "adding user-defined operators" chapter of the
783tutorial. We successfully extended our language with the ability to extend the
784language in the library, and showed how this can be used to build a simple but
785interesting end user application in Kaleidoscope. At this point, Kaleidoscope
786can build a variety of applications that are functional and can call functions
Chris Lattnerb5d81b32007-11-02 05:54:25 +0000787with side-effects, but it can't actually define and mutate a variable itself.
Chris Lattner58f2c872007-11-02 05:42:52 +0000788</p>
789
790<p>Strikingly, lack of this feature is an important limitation for some
791languages, and it is not at all obvious how to <a href="LangImpl7.html">add
792support for mutable variables</a> without having to add an "SSA construction"
793phase to your front-end. In the next chapter, we will describe how you can
794add this without building SSA in your front-end.</p>
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +0000795
796</div>
797
798
799<!-- *********************************************************************** -->
800<div class="doc_section"><a name="code">Full Code Listing</a></div>
801<!-- *********************************************************************** -->
802
803<div class="doc_text">
804
805<p>
806Here is the complete code listing for our running example, enhanced with the
807if/then/else and for expressions.. To build this example, use:
808</p>
809
810<div class="doc_code">
811<pre>
812 # Compile
813 g++ -g toy.cpp `llvm-config --cppflags --ldflags --libs core jit native` -O3 -o toy
814 # Run
815 ./toy
816</pre>
817</div>
818
819<p>Here is the code:</p>
820
821<div class="doc_code">
822<pre>
Chris Lattner58f2c872007-11-02 05:42:52 +0000823#include "llvm/DerivedTypes.h"
824#include "llvm/ExecutionEngine/ExecutionEngine.h"
825#include "llvm/Module.h"
826#include "llvm/ModuleProvider.h"
827#include "llvm/PassManager.h"
828#include "llvm/Analysis/Verifier.h"
829#include "llvm/Target/TargetData.h"
830#include "llvm/Transforms/Scalar.h"
831#include "llvm/Support/LLVMBuilder.h"
832#include &lt;cstdio&gt;
833#include &lt;string&gt;
834#include &lt;map&gt;
835#include &lt;vector&gt;
836using namespace llvm;
837
838//===----------------------------------------------------------------------===//
839// Lexer
840//===----------------------------------------------------------------------===//
841
842// The lexer returns tokens [0-255] if it is an unknown character, otherwise one
843// of these for known things.
844enum Token {
845 tok_eof = -1,
846
847 // commands
848 tok_def = -2, tok_extern = -3,
849
850 // primary
851 tok_identifier = -4, tok_number = -5,
852
853 // control
854 tok_if = -6, tok_then = -7, tok_else = -8,
855 tok_for = -9, tok_in = -10,
856
857 // operators
858 tok_binary = -11, tok_unary = -12
859};
860
861static std::string IdentifierStr; // Filled in if tok_identifier
862static double NumVal; // Filled in if tok_number
863
864/// gettok - Return the next token from standard input.
865static int gettok() {
866 static int LastChar = ' ';
867
868 // Skip any whitespace.
869 while (isspace(LastChar))
870 LastChar = getchar();
871
872 if (isalpha(LastChar)) { // identifier: [a-zA-Z][a-zA-Z0-9]*
873 IdentifierStr = LastChar;
874 while (isalnum((LastChar = getchar())))
875 IdentifierStr += LastChar;
876
877 if (IdentifierStr == "def") return tok_def;
878 if (IdentifierStr == "extern") return tok_extern;
879 if (IdentifierStr == "if") return tok_if;
880 if (IdentifierStr == "then") return tok_then;
881 if (IdentifierStr == "else") return tok_else;
882 if (IdentifierStr == "for") return tok_for;
883 if (IdentifierStr == "in") return tok_in;
884 if (IdentifierStr == "binary") return tok_binary;
885 if (IdentifierStr == "unary") return tok_unary;
886 return tok_identifier;
887 }
888
889 if (isdigit(LastChar) || LastChar == '.') { // Number: [0-9.]+
890 std::string NumStr;
891 do {
892 NumStr += LastChar;
893 LastChar = getchar();
894 } while (isdigit(LastChar) || LastChar == '.');
895
896 NumVal = strtod(NumStr.c_str(), 0);
897 return tok_number;
898 }
899
900 if (LastChar == '#') {
901 // Comment until end of line.
902 do LastChar = getchar();
903 while (LastChar != EOF &amp;&amp; LastChar != '\n' &amp; LastChar != '\r');
904
905 if (LastChar != EOF)
906 return gettok();
907 }
908
909 // Check for end of file. Don't eat the EOF.
910 if (LastChar == EOF)
911 return tok_eof;
912
913 // Otherwise, just return the character as its ascii value.
914 int ThisChar = LastChar;
915 LastChar = getchar();
916 return ThisChar;
917}
918
919//===----------------------------------------------------------------------===//
920// Abstract Syntax Tree (aka Parse Tree)
921//===----------------------------------------------------------------------===//
922
923/// ExprAST - Base class for all expression nodes.
924class ExprAST {
925public:
926 virtual ~ExprAST() {}
927 virtual Value *Codegen() = 0;
928};
929
930/// NumberExprAST - Expression class for numeric literals like "1.0".
931class NumberExprAST : public ExprAST {
932 double Val;
933public:
934 NumberExprAST(double val) : Val(val) {}
935 virtual Value *Codegen();
936};
937
938/// VariableExprAST - Expression class for referencing a variable, like "a".
939class VariableExprAST : public ExprAST {
940 std::string Name;
941public:
942 VariableExprAST(const std::string &amp;name) : Name(name) {}
943 virtual Value *Codegen();
944};
945
946/// UnaryExprAST - Expression class for a unary operator.
947class UnaryExprAST : public ExprAST {
948 char Opcode;
949 ExprAST *Operand;
950public:
951 UnaryExprAST(char opcode, ExprAST *operand)
952 : Opcode(opcode), Operand(operand) {}
953 virtual Value *Codegen();
954};
955
956/// BinaryExprAST - Expression class for a binary operator.
957class BinaryExprAST : public ExprAST {
958 char Op;
959 ExprAST *LHS, *RHS;
960public:
961 BinaryExprAST(char op, ExprAST *lhs, ExprAST *rhs)
962 : Op(op), LHS(lhs), RHS(rhs) {}
963 virtual Value *Codegen();
964};
965
966/// CallExprAST - Expression class for function calls.
967class CallExprAST : public ExprAST {
968 std::string Callee;
969 std::vector&lt;ExprAST*&gt; Args;
970public:
971 CallExprAST(const std::string &amp;callee, std::vector&lt;ExprAST*&gt; &amp;args)
972 : Callee(callee), Args(args) {}
973 virtual Value *Codegen();
974};
975
976/// IfExprAST - Expression class for if/then/else.
977class IfExprAST : public ExprAST {
978 ExprAST *Cond, *Then, *Else;
979public:
980 IfExprAST(ExprAST *cond, ExprAST *then, ExprAST *_else)
981 : Cond(cond), Then(then), Else(_else) {}
982 virtual Value *Codegen();
983};
984
985/// ForExprAST - Expression class for for/in.
986class ForExprAST : public ExprAST {
987 std::string VarName;
988 ExprAST *Start, *End, *Step, *Body;
989public:
990 ForExprAST(const std::string &amp;varname, ExprAST *start, ExprAST *end,
991 ExprAST *step, ExprAST *body)
992 : VarName(varname), Start(start), End(end), Step(step), Body(body) {}
993 virtual Value *Codegen();
994};
995
996/// PrototypeAST - This class represents the "prototype" for a function,
997/// which captures its argument names as well as if it is an operator.
998class PrototypeAST {
999 std::string Name;
1000 std::vector&lt;std::string&gt; Args;
1001 bool isOperator;
1002 unsigned Precedence; // Precedence if a binary op.
1003public:
1004 PrototypeAST(const std::string &amp;name, const std::vector&lt;std::string&gt; &amp;args,
1005 bool isoperator = false, unsigned prec = 0)
1006 : Name(name), Args(args), isOperator(isoperator), Precedence(prec) {}
1007
1008 bool isUnaryOp() const { return isOperator &amp;&amp; Args.size() == 1; }
1009 bool isBinaryOp() const { return isOperator &amp;&amp; Args.size() == 2; }
1010
1011 char getOperatorName() const {
1012 assert(isUnaryOp() || isBinaryOp());
1013 return Name[Name.size()-1];
1014 }
1015
1016 unsigned getBinaryPrecedence() const { return Precedence; }
1017
1018 Function *Codegen();
1019};
1020
1021/// FunctionAST - This class represents a function definition itself.
1022class FunctionAST {
1023 PrototypeAST *Proto;
1024 ExprAST *Body;
1025public:
1026 FunctionAST(PrototypeAST *proto, ExprAST *body)
1027 : Proto(proto), Body(body) {}
1028
1029 Function *Codegen();
1030};
1031
1032//===----------------------------------------------------------------------===//
1033// Parser
1034//===----------------------------------------------------------------------===//
1035
1036/// CurTok/getNextToken - Provide a simple token buffer. CurTok is the current
1037/// token the parser it looking at. getNextToken reads another token from the
1038/// lexer and updates CurTok with its results.
1039static int CurTok;
1040static int getNextToken() {
1041 return CurTok = gettok();
1042}
1043
1044/// BinopPrecedence - This holds the precedence for each binary operator that is
1045/// defined.
1046static std::map&lt;char, int&gt; BinopPrecedence;
1047
1048/// GetTokPrecedence - Get the precedence of the pending binary operator token.
1049static int GetTokPrecedence() {
1050 if (!isascii(CurTok))
1051 return -1;
1052
1053 // Make sure it's a declared binop.
1054 int TokPrec = BinopPrecedence[CurTok];
1055 if (TokPrec &lt;= 0) return -1;
1056 return TokPrec;
1057}
1058
1059/// Error* - These are little helper functions for error handling.
1060ExprAST *Error(const char *Str) { fprintf(stderr, "Error: %s\n", Str);return 0;}
1061PrototypeAST *ErrorP(const char *Str) { Error(Str); return 0; }
1062FunctionAST *ErrorF(const char *Str) { Error(Str); return 0; }
1063
1064static ExprAST *ParseExpression();
1065
1066/// identifierexpr
Chris Lattner20a0c802007-11-05 17:54:34 +00001067/// ::= identifier
1068/// ::= identifier '(' expression* ')'
Chris Lattner58f2c872007-11-02 05:42:52 +00001069static ExprAST *ParseIdentifierExpr() {
1070 std::string IdName = IdentifierStr;
1071
Chris Lattner20a0c802007-11-05 17:54:34 +00001072 getNextToken(); // eat identifier.
Chris Lattner58f2c872007-11-02 05:42:52 +00001073
1074 if (CurTok != '(') // Simple variable ref.
1075 return new VariableExprAST(IdName);
1076
1077 // Call.
1078 getNextToken(); // eat (
1079 std::vector&lt;ExprAST*&gt; Args;
1080 if (CurTok != ')') {
1081 while (1) {
1082 ExprAST *Arg = ParseExpression();
1083 if (!Arg) return 0;
1084 Args.push_back(Arg);
1085
1086 if (CurTok == ')') break;
1087
1088 if (CurTok != ',')
1089 return Error("Expected ')'");
1090 getNextToken();
1091 }
1092 }
1093
1094 // Eat the ')'.
1095 getNextToken();
1096
1097 return new CallExprAST(IdName, Args);
1098}
1099
1100/// numberexpr ::= number
1101static ExprAST *ParseNumberExpr() {
1102 ExprAST *Result = new NumberExprAST(NumVal);
1103 getNextToken(); // consume the number
1104 return Result;
1105}
1106
1107/// parenexpr ::= '(' expression ')'
1108static ExprAST *ParseParenExpr() {
1109 getNextToken(); // eat (.
1110 ExprAST *V = ParseExpression();
1111 if (!V) return 0;
1112
1113 if (CurTok != ')')
1114 return Error("expected ')'");
1115 getNextToken(); // eat ).
1116 return V;
1117}
1118
1119/// ifexpr ::= 'if' expression 'then' expression 'else' expression
1120static ExprAST *ParseIfExpr() {
1121 getNextToken(); // eat the if.
1122
1123 // condition.
1124 ExprAST *Cond = ParseExpression();
1125 if (!Cond) return 0;
1126
1127 if (CurTok != tok_then)
1128 return Error("expected then");
1129 getNextToken(); // eat the then
1130
1131 ExprAST *Then = ParseExpression();
1132 if (Then == 0) return 0;
1133
1134 if (CurTok != tok_else)
1135 return Error("expected else");
1136
1137 getNextToken();
1138
1139 ExprAST *Else = ParseExpression();
1140 if (!Else) return 0;
1141
1142 return new IfExprAST(Cond, Then, Else);
1143}
1144
Chris Lattner20a0c802007-11-05 17:54:34 +00001145/// forexpr ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression
Chris Lattner58f2c872007-11-02 05:42:52 +00001146static ExprAST *ParseForExpr() {
1147 getNextToken(); // eat the for.
1148
1149 if (CurTok != tok_identifier)
1150 return Error("expected identifier after for");
1151
1152 std::string IdName = IdentifierStr;
Chris Lattner20a0c802007-11-05 17:54:34 +00001153 getNextToken(); // eat identifier.
Chris Lattner58f2c872007-11-02 05:42:52 +00001154
1155 if (CurTok != '=')
1156 return Error("expected '=' after for");
1157 getNextToken(); // eat '='.
1158
1159
1160 ExprAST *Start = ParseExpression();
1161 if (Start == 0) return 0;
1162 if (CurTok != ',')
1163 return Error("expected ',' after for start value");
1164 getNextToken();
1165
1166 ExprAST *End = ParseExpression();
1167 if (End == 0) return 0;
1168
1169 // The step value is optional.
1170 ExprAST *Step = 0;
1171 if (CurTok == ',') {
1172 getNextToken();
1173 Step = ParseExpression();
1174 if (Step == 0) return 0;
1175 }
1176
1177 if (CurTok != tok_in)
1178 return Error("expected 'in' after for");
1179 getNextToken(); // eat 'in'.
1180
1181 ExprAST *Body = ParseExpression();
1182 if (Body == 0) return 0;
1183
1184 return new ForExprAST(IdName, Start, End, Step, Body);
1185}
1186
1187
1188/// primary
1189/// ::= identifierexpr
1190/// ::= numberexpr
1191/// ::= parenexpr
1192/// ::= ifexpr
1193/// ::= forexpr
1194static ExprAST *ParsePrimary() {
1195 switch (CurTok) {
1196 default: return Error("unknown token when expecting an expression");
1197 case tok_identifier: return ParseIdentifierExpr();
1198 case tok_number: return ParseNumberExpr();
1199 case '(': return ParseParenExpr();
1200 case tok_if: return ParseIfExpr();
1201 case tok_for: return ParseForExpr();
1202 }
1203}
1204
1205/// unary
1206/// ::= primary
1207/// ::= '!' unary
1208static ExprAST *ParseUnary() {
1209 // If the current token is not an operator, it must be a primary expr.
1210 if (!isascii(CurTok) || CurTok == '(' || CurTok == ',')
1211 return ParsePrimary();
1212
1213 // If this is a unary operator, read it.
1214 int Opc = CurTok;
1215 getNextToken();
1216 if (ExprAST *Operand = ParseUnary())
1217 return new UnaryExprAST(Opc, Operand);
1218 return 0;
1219}
1220
1221/// binoprhs
1222/// ::= ('+' unary)*
1223static ExprAST *ParseBinOpRHS(int ExprPrec, ExprAST *LHS) {
1224 // If this is a binop, find its precedence.
1225 while (1) {
1226 int TokPrec = GetTokPrecedence();
1227
1228 // If this is a binop that binds at least as tightly as the current binop,
1229 // consume it, otherwise we are done.
1230 if (TokPrec &lt; ExprPrec)
1231 return LHS;
1232
1233 // Okay, we know this is a binop.
1234 int BinOp = CurTok;
1235 getNextToken(); // eat binop
1236
1237 // Parse the unary expression after the binary operator.
1238 ExprAST *RHS = ParseUnary();
1239 if (!RHS) return 0;
1240
1241 // If BinOp binds less tightly with RHS than the operator after RHS, let
1242 // the pending operator take RHS as its LHS.
1243 int NextPrec = GetTokPrecedence();
1244 if (TokPrec &lt; NextPrec) {
1245 RHS = ParseBinOpRHS(TokPrec+1, RHS);
1246 if (RHS == 0) return 0;
1247 }
1248
1249 // Merge LHS/RHS.
1250 LHS = new BinaryExprAST(BinOp, LHS, RHS);
1251 }
1252}
1253
1254/// expression
1255/// ::= unary binoprhs
1256///
1257static ExprAST *ParseExpression() {
1258 ExprAST *LHS = ParseUnary();
1259 if (!LHS) return 0;
1260
1261 return ParseBinOpRHS(0, LHS);
1262}
1263
1264/// prototype
1265/// ::= id '(' id* ')'
1266/// ::= binary LETTER number? (id, id)
1267/// ::= unary LETTER (id)
1268static PrototypeAST *ParsePrototype() {
1269 std::string FnName;
1270
1271 int Kind = 0; // 0 = identifier, 1 = unary, 2 = binary.
1272 unsigned BinaryPrecedence = 30;
1273
1274 switch (CurTok) {
1275 default:
1276 return ErrorP("Expected function name in prototype");
1277 case tok_identifier:
1278 FnName = IdentifierStr;
1279 Kind = 0;
1280 getNextToken();
1281 break;
1282 case tok_unary:
1283 getNextToken();
1284 if (!isascii(CurTok))
1285 return ErrorP("Expected unary operator");
1286 FnName = "unary";
1287 FnName += (char)CurTok;
1288 Kind = 1;
1289 getNextToken();
1290 break;
1291 case tok_binary:
1292 getNextToken();
1293 if (!isascii(CurTok))
1294 return ErrorP("Expected binary operator");
1295 FnName = "binary";
1296 FnName += (char)CurTok;
1297 Kind = 2;
1298 getNextToken();
1299
1300 // Read the precedence if present.
1301 if (CurTok == tok_number) {
1302 if (NumVal &lt; 1 || NumVal &gt; 100)
1303 return ErrorP("Invalid precedecnce: must be 1..100");
1304 BinaryPrecedence = (unsigned)NumVal;
1305 getNextToken();
1306 }
1307 break;
1308 }
1309
1310 if (CurTok != '(')
1311 return ErrorP("Expected '(' in prototype");
1312
1313 std::vector&lt;std::string&gt; ArgNames;
1314 while (getNextToken() == tok_identifier)
1315 ArgNames.push_back(IdentifierStr);
1316 if (CurTok != ')')
1317 return ErrorP("Expected ')' in prototype");
1318
1319 // success.
1320 getNextToken(); // eat ')'.
1321
1322 // Verify right number of names for operator.
1323 if (Kind &amp;&amp; ArgNames.size() != Kind)
1324 return ErrorP("Invalid number of operands for operator");
1325
1326 return new PrototypeAST(FnName, ArgNames, Kind != 0, BinaryPrecedence);
1327}
1328
1329/// definition ::= 'def' prototype expression
1330static FunctionAST *ParseDefinition() {
1331 getNextToken(); // eat def.
1332 PrototypeAST *Proto = ParsePrototype();
1333 if (Proto == 0) return 0;
1334
1335 if (ExprAST *E = ParseExpression())
1336 return new FunctionAST(Proto, E);
1337 return 0;
1338}
1339
1340/// toplevelexpr ::= expression
1341static FunctionAST *ParseTopLevelExpr() {
1342 if (ExprAST *E = ParseExpression()) {
1343 // Make an anonymous proto.
1344 PrototypeAST *Proto = new PrototypeAST("", std::vector&lt;std::string&gt;());
1345 return new FunctionAST(Proto, E);
1346 }
1347 return 0;
1348}
1349
1350/// external ::= 'extern' prototype
1351static PrototypeAST *ParseExtern() {
1352 getNextToken(); // eat extern.
1353 return ParsePrototype();
1354}
1355
1356//===----------------------------------------------------------------------===//
1357// Code Generation
1358//===----------------------------------------------------------------------===//
1359
1360static Module *TheModule;
1361static LLVMFoldingBuilder Builder;
1362static std::map&lt;std::string, Value*&gt; NamedValues;
1363static FunctionPassManager *TheFPM;
1364
1365Value *ErrorV(const char *Str) { Error(Str); return 0; }
1366
1367Value *NumberExprAST::Codegen() {
1368 return ConstantFP::get(Type::DoubleTy, APFloat(Val));
1369}
1370
1371Value *VariableExprAST::Codegen() {
1372 // Look this variable up in the function.
1373 Value *V = NamedValues[Name];
1374 return V ? V : ErrorV("Unknown variable name");
1375}
1376
1377Value *UnaryExprAST::Codegen() {
1378 Value *OperandV = Operand-&gt;Codegen();
1379 if (OperandV == 0) return 0;
1380
1381 Function *F = TheModule-&gt;getFunction(std::string("unary")+Opcode);
1382 if (F == 0)
1383 return ErrorV("Unknown unary operator");
1384
1385 return Builder.CreateCall(F, OperandV, "unop");
1386}
1387
1388
1389Value *BinaryExprAST::Codegen() {
1390 Value *L = LHS-&gt;Codegen();
1391 Value *R = RHS-&gt;Codegen();
1392 if (L == 0 || R == 0) return 0;
1393
1394 switch (Op) {
1395 case '+': return Builder.CreateAdd(L, R, "addtmp");
1396 case '-': return Builder.CreateSub(L, R, "subtmp");
1397 case '*': return Builder.CreateMul(L, R, "multmp");
1398 case '&lt;':
Chris Lattner71155212007-11-06 01:39:12 +00001399 L = Builder.CreateFCmpULT(L, R, "cmptmp");
Chris Lattner58f2c872007-11-02 05:42:52 +00001400 // Convert bool 0/1 to double 0.0 or 1.0
1401 return Builder.CreateUIToFP(L, Type::DoubleTy, "booltmp");
1402 default: break;
1403 }
1404
1405 // If it wasn't a builtin binary operator, it must be a user defined one. Emit
1406 // a call to it.
1407 Function *F = TheModule-&gt;getFunction(std::string("binary")+Op);
1408 assert(F &amp;&amp; "binary operator not found!");
1409
1410 Value *Ops[] = { L, R };
1411 return Builder.CreateCall(F, Ops, Ops+2, "binop");
1412}
1413
1414Value *CallExprAST::Codegen() {
1415 // Look up the name in the global module table.
1416 Function *CalleeF = TheModule-&gt;getFunction(Callee);
1417 if (CalleeF == 0)
1418 return ErrorV("Unknown function referenced");
1419
1420 // If argument mismatch error.
1421 if (CalleeF-&gt;arg_size() != Args.size())
1422 return ErrorV("Incorrect # arguments passed");
1423
1424 std::vector&lt;Value*&gt; ArgsV;
1425 for (unsigned i = 0, e = Args.size(); i != e; ++i) {
1426 ArgsV.push_back(Args[i]-&gt;Codegen());
1427 if (ArgsV.back() == 0) return 0;
1428 }
1429
1430 return Builder.CreateCall(CalleeF, ArgsV.begin(), ArgsV.end(), "calltmp");
1431}
1432
1433Value *IfExprAST::Codegen() {
1434 Value *CondV = Cond-&gt;Codegen();
1435 if (CondV == 0) return 0;
1436
1437 // Convert condition to a bool by comparing equal to 0.0.
1438 CondV = Builder.CreateFCmpONE(CondV,
1439 ConstantFP::get(Type::DoubleTy, APFloat(0.0)),
1440 "ifcond");
1441
1442 Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();
1443
1444 // Create blocks for the then and else cases. Insert the 'then' block at the
1445 // end of the function.
1446 BasicBlock *ThenBB = new BasicBlock("then", TheFunction);
1447 BasicBlock *ElseBB = new BasicBlock("else");
1448 BasicBlock *MergeBB = new BasicBlock("ifcont");
1449
1450 Builder.CreateCondBr(CondV, ThenBB, ElseBB);
1451
1452 // Emit then value.
1453 Builder.SetInsertPoint(ThenBB);
1454
1455 Value *ThenV = Then-&gt;Codegen();
1456 if (ThenV == 0) return 0;
1457
1458 Builder.CreateBr(MergeBB);
1459 // Codegen of 'Then' can change the current block, update ThenBB for the PHI.
1460 ThenBB = Builder.GetInsertBlock();
1461
1462 // Emit else block.
1463 TheFunction-&gt;getBasicBlockList().push_back(ElseBB);
1464 Builder.SetInsertPoint(ElseBB);
1465
1466 Value *ElseV = Else-&gt;Codegen();
1467 if (ElseV == 0) return 0;
1468
1469 Builder.CreateBr(MergeBB);
1470 // Codegen of 'Else' can change the current block, update ElseBB for the PHI.
1471 ElseBB = Builder.GetInsertBlock();
1472
1473 // Emit merge block.
1474 TheFunction-&gt;getBasicBlockList().push_back(MergeBB);
1475 Builder.SetInsertPoint(MergeBB);
1476 PHINode *PN = Builder.CreatePHI(Type::DoubleTy, "iftmp");
1477
1478 PN-&gt;addIncoming(ThenV, ThenBB);
1479 PN-&gt;addIncoming(ElseV, ElseBB);
1480 return PN;
1481}
1482
1483Value *ForExprAST::Codegen() {
1484 // Output this as:
1485 // ...
1486 // start = startexpr
1487 // goto loop
1488 // loop:
1489 // variable = phi [start, loopheader], [nextvariable, loopend]
1490 // ...
1491 // bodyexpr
1492 // ...
1493 // loopend:
1494 // step = stepexpr
1495 // nextvariable = variable + step
1496 // endcond = endexpr
1497 // br endcond, loop, endloop
1498 // outloop:
1499
1500 // Emit the start code first, without 'variable' in scope.
1501 Value *StartVal = Start-&gt;Codegen();
1502 if (StartVal == 0) return 0;
1503
1504 // Make the new basic block for the loop header, inserting after current
1505 // block.
1506 Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();
1507 BasicBlock *PreheaderBB = Builder.GetInsertBlock();
1508 BasicBlock *LoopBB = new BasicBlock("loop", TheFunction);
1509
1510 // Insert an explicit fall through from the current block to the LoopBB.
1511 Builder.CreateBr(LoopBB);
1512
1513 // Start insertion in LoopBB.
1514 Builder.SetInsertPoint(LoopBB);
1515
1516 // Start the PHI node with an entry for Start.
1517 PHINode *Variable = Builder.CreatePHI(Type::DoubleTy, VarName.c_str());
1518 Variable-&gt;addIncoming(StartVal, PreheaderBB);
1519
1520 // Within the loop, the variable is defined equal to the PHI node. If it
1521 // shadows an existing variable, we have to restore it, so save it now.
1522 Value *OldVal = NamedValues[VarName];
1523 NamedValues[VarName] = Variable;
1524
1525 // Emit the body of the loop. This, like any other expr, can change the
1526 // current BB. Note that we ignore the value computed by the body, but don't
1527 // allow an error.
1528 if (Body-&gt;Codegen() == 0)
1529 return 0;
1530
1531 // Emit the step value.
1532 Value *StepVal;
1533 if (Step) {
1534 StepVal = Step-&gt;Codegen();
1535 if (StepVal == 0) return 0;
1536 } else {
1537 // If not specified, use 1.0.
1538 StepVal = ConstantFP::get(Type::DoubleTy, APFloat(1.0));
1539 }
1540
1541 Value *NextVar = Builder.CreateAdd(Variable, StepVal, "nextvar");
1542
1543 // Compute the end condition.
1544 Value *EndCond = End-&gt;Codegen();
1545 if (EndCond == 0) return EndCond;
1546
1547 // Convert condition to a bool by comparing equal to 0.0.
1548 EndCond = Builder.CreateFCmpONE(EndCond,
1549 ConstantFP::get(Type::DoubleTy, APFloat(0.0)),
1550 "loopcond");
1551
1552 // Create the "after loop" block and insert it.
1553 BasicBlock *LoopEndBB = Builder.GetInsertBlock();
1554 BasicBlock *AfterBB = new BasicBlock("afterloop", TheFunction);
1555
1556 // Insert the conditional branch into the end of LoopEndBB.
1557 Builder.CreateCondBr(EndCond, LoopBB, AfterBB);
1558
1559 // Any new code will be inserted in AfterBB.
1560 Builder.SetInsertPoint(AfterBB);
1561
1562 // Add a new entry to the PHI node for the backedge.
1563 Variable-&gt;addIncoming(NextVar, LoopEndBB);
1564
1565 // Restore the unshadowed variable.
1566 if (OldVal)
1567 NamedValues[VarName] = OldVal;
1568 else
1569 NamedValues.erase(VarName);
1570
1571
1572 // for expr always returns 0.0.
1573 return Constant::getNullValue(Type::DoubleTy);
1574}
1575
1576Function *PrototypeAST::Codegen() {
1577 // Make the function type: double(double,double) etc.
1578 std::vector&lt;const Type*&gt; Doubles(Args.size(), Type::DoubleTy);
1579 FunctionType *FT = FunctionType::get(Type::DoubleTy, Doubles, false);
1580
1581 Function *F = new Function(FT, Function::ExternalLinkage, Name, TheModule);
1582
1583 // If F conflicted, there was already something named 'Name'. If it has a
1584 // body, don't allow redefinition or reextern.
1585 if (F-&gt;getName() != Name) {
1586 // Delete the one we just made and get the existing one.
1587 F-&gt;eraseFromParent();
1588 F = TheModule-&gt;getFunction(Name);
1589
1590 // If F already has a body, reject this.
1591 if (!F-&gt;empty()) {
1592 ErrorF("redefinition of function");
1593 return 0;
1594 }
1595
1596 // If F took a different number of args, reject.
1597 if (F-&gt;arg_size() != Args.size()) {
1598 ErrorF("redefinition of function with different # args");
1599 return 0;
1600 }
1601 }
1602
1603 // Set names for all arguments.
1604 unsigned Idx = 0;
1605 for (Function::arg_iterator AI = F-&gt;arg_begin(); Idx != Args.size();
1606 ++AI, ++Idx) {
1607 AI-&gt;setName(Args[Idx]);
1608
1609 // Add arguments to variable symbol table.
1610 NamedValues[Args[Idx]] = AI;
1611 }
1612
1613 return F;
1614}
1615
1616Function *FunctionAST::Codegen() {
1617 NamedValues.clear();
1618
1619 Function *TheFunction = Proto-&gt;Codegen();
1620 if (TheFunction == 0)
1621 return 0;
1622
1623 // If this is an operator, install it.
1624 if (Proto-&gt;isBinaryOp())
1625 BinopPrecedence[Proto-&gt;getOperatorName()] = Proto-&gt;getBinaryPrecedence();
1626
1627 // Create a new basic block to start insertion into.
1628 BasicBlock *BB = new BasicBlock("entry", TheFunction);
1629 Builder.SetInsertPoint(BB);
1630
1631 if (Value *RetVal = Body-&gt;Codegen()) {
1632 // Finish off the function.
1633 Builder.CreateRet(RetVal);
1634
1635 // Validate the generated code, checking for consistency.
1636 verifyFunction(*TheFunction);
1637
1638 // Optimize the function.
1639 TheFPM-&gt;run(*TheFunction);
1640
1641 return TheFunction;
1642 }
1643
1644 // Error reading body, remove function.
1645 TheFunction-&gt;eraseFromParent();
1646
1647 if (Proto-&gt;isBinaryOp())
1648 BinopPrecedence.erase(Proto-&gt;getOperatorName());
1649 return 0;
1650}
1651
1652//===----------------------------------------------------------------------===//
1653// Top-Level parsing and JIT Driver
1654//===----------------------------------------------------------------------===//
1655
1656static ExecutionEngine *TheExecutionEngine;
1657
1658static void HandleDefinition() {
1659 if (FunctionAST *F = ParseDefinition()) {
1660 if (Function *LF = F-&gt;Codegen()) {
1661 fprintf(stderr, "Read function definition:");
1662 LF-&gt;dump();
1663 }
1664 } else {
1665 // Skip token for error recovery.
1666 getNextToken();
1667 }
1668}
1669
1670static void HandleExtern() {
1671 if (PrototypeAST *P = ParseExtern()) {
1672 if (Function *F = P-&gt;Codegen()) {
1673 fprintf(stderr, "Read extern: ");
1674 F-&gt;dump();
1675 }
1676 } else {
1677 // Skip token for error recovery.
1678 getNextToken();
1679 }
1680}
1681
1682static void HandleTopLevelExpression() {
1683 // Evaluate a top level expression into an anonymous function.
1684 if (FunctionAST *F = ParseTopLevelExpr()) {
1685 if (Function *LF = F-&gt;Codegen()) {
1686 // JIT the function, returning a function pointer.
1687 void *FPtr = TheExecutionEngine-&gt;getPointerToFunction(LF);
1688
1689 // Cast it to the right type (takes no arguments, returns a double) so we
1690 // can call it as a native function.
1691 double (*FP)() = (double (*)())FPtr;
1692 fprintf(stderr, "Evaluated to %f\n", FP());
1693 }
1694 } else {
1695 // Skip token for error recovery.
1696 getNextToken();
1697 }
1698}
1699
1700/// top ::= definition | external | expression | ';'
1701static void MainLoop() {
1702 while (1) {
1703 fprintf(stderr, "ready&gt; ");
1704 switch (CurTok) {
1705 case tok_eof: return;
1706 case ';': getNextToken(); break; // ignore top level semicolons.
1707 case tok_def: HandleDefinition(); break;
1708 case tok_extern: HandleExtern(); break;
1709 default: HandleTopLevelExpression(); break;
1710 }
1711 }
1712}
1713
1714
1715
1716//===----------------------------------------------------------------------===//
1717// "Library" functions that can be "extern'd" from user code.
1718//===----------------------------------------------------------------------===//
1719
1720/// putchard - putchar that takes a double and returns 0.
1721extern "C"
1722double putchard(double X) {
1723 putchar((char)X);
1724 return 0;
1725}
1726
1727/// printd - printf that takes a double prints it as "%f\n", returning 0.
1728extern "C"
1729double printd(double X) {
1730 printf("%f\n", X);
1731 return 0;
1732}
1733
1734//===----------------------------------------------------------------------===//
1735// Main driver code.
1736//===----------------------------------------------------------------------===//
1737
1738int main() {
1739 // Install standard binary operators.
1740 // 1 is lowest precedence.
1741 BinopPrecedence['&lt;'] = 10;
1742 BinopPrecedence['+'] = 20;
1743 BinopPrecedence['-'] = 20;
1744 BinopPrecedence['*'] = 40; // highest.
1745
1746 // Prime the first token.
1747 fprintf(stderr, "ready&gt; ");
1748 getNextToken();
1749
1750 // Make the module, which holds all the code.
1751 TheModule = new Module("my cool jit");
1752
1753 // Create the JIT.
1754 TheExecutionEngine = ExecutionEngine::create(TheModule);
1755
1756 {
1757 ExistingModuleProvider OurModuleProvider(TheModule);
1758 FunctionPassManager OurFPM(&amp;OurModuleProvider);
1759
1760 // Set up the optimizer pipeline. Start with registering info about how the
1761 // target lays out data structures.
1762 OurFPM.add(new TargetData(*TheExecutionEngine-&gt;getTargetData()));
1763 // Do simple "peephole" optimizations and bit-twiddling optzns.
1764 OurFPM.add(createInstructionCombiningPass());
1765 // Reassociate expressions.
1766 OurFPM.add(createReassociatePass());
1767 // Eliminate Common SubExpressions.
1768 OurFPM.add(createGVNPass());
1769 // Simplify the control flow graph (deleting unreachable blocks, etc).
1770 OurFPM.add(createCFGSimplificationPass());
1771 // Set the global so the code gen can use this.
1772 TheFPM = &amp;OurFPM;
1773
1774 // Run the main "interpreter loop" now.
1775 MainLoop();
1776
1777 TheFPM = 0;
1778 } // Free module provider and pass manager.
1779
1780
1781 // Print out all of the generated code.
1782 TheModule->dump();
1783 return 0;
1784}
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +00001785</pre>
1786</div>
1787
1788</div>
1789
1790<!-- *********************************************************************** -->
1791<hr>
1792<address>
1793 <a href="http://jigsaw.w3.org/css-validator/check/referer"><img
1794 src="http://jigsaw.w3.org/css-validator/images/vcss" alt="Valid CSS!"></a>
1795 <a href="http://validator.w3.org/check/referer"><img
1796 src="http://www.w3.org/Icons/valid-html401" alt="Valid HTML 4.01!"></a>
1797
1798 <a href="mailto:sabre@nondot.org">Chris Lattner</a><br>
1799 <a href="http://llvm.org">The LLVM Compiler Infrastructure</a><br>
1800 Last modified: $Date: 2007-10-17 11:05:13 -0700 (Wed, 17 Oct 2007) $
1801</address>
1802</body>
1803</html>