blob: 33f4adbdebb016df1da2745bb538a48852a2cd8d [file] [log] [blame]
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +00001<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
2 "http://www.w3.org/TR/html4/strict.dtd">
3
4<html>
5<head>
Chris Lattner58f2c872007-11-02 05:42:52 +00006 <title>Kaleidoscope: Extending the Language: User-defined Operators</title>
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +00007 <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
8 <meta name="author" content="Chris Lattner">
9 <link rel="stylesheet" href="../llvm.css" type="text/css">
10</head>
11
12<body>
13
Chris Lattner58f2c872007-11-02 05:42:52 +000014<div class="doc_title">Kaleidoscope: Extending the Language: User-defined Operators</div>
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +000015
Chris Lattner128eb862007-11-05 19:06:59 +000016<ul>
Chris Lattner0e555b12007-11-05 20:04:56 +000017<li><a href="index.html">Up to Tutorial Index</a></li>
Chris Lattner128eb862007-11-05 19:06:59 +000018<li>Chapter 6
19 <ol>
20 <li><a href="#intro">Chapter 6 Introduction</a></li>
21 <li><a href="#idea">User-defined Operators: the Idea</a></li>
22 <li><a href="#binary">User-defined Binary Operators</a></li>
23 <li><a href="#unary">User-defined Unary Operators</a></li>
24 <li><a href="#example">Kicking the Tires</a></li>
25 <li><a href="#code">Full Code Listing</a></li>
26 </ol>
27</li>
Chris Lattner0e555b12007-11-05 20:04:56 +000028<li><a href="LangImpl7.html">Chapter 7</a>: Extending the Language: Mutable
29Variables / SSA Construction</li>
Chris Lattner128eb862007-11-05 19:06:59 +000030</ul>
31
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +000032<div class="doc_author">
33 <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a></p>
34</div>
35
36<!-- *********************************************************************** -->
Chris Lattner128eb862007-11-05 19:06:59 +000037<div class="doc_section"><a name="intro">Chapter 6 Introduction</a></div>
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +000038<!-- *********************************************************************** -->
39
40<div class="doc_text">
41
Chris Lattner128eb862007-11-05 19:06:59 +000042<p>Welcome to Chapter 6 of the "<a href="index.html">Implementing a language
43with LLVM</a>" tutorial. At this point in our tutorial, we now have a fully
Chris Lattnerb7e6b1a2007-11-15 04:51:31 +000044functional language that is fairly minimal, but also useful. There
45is still one big problem with it, however. Our language doesn't have many
46useful operators (like division, logical negation, or even any comparisons
47besides less-than).</p>
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +000048
Chris Lattner58f2c872007-11-02 05:42:52 +000049<p>This chapter of the tutorial takes a wild digression into adding user-defined
Chris Lattnerb7e6b1a2007-11-15 04:51:31 +000050operators to the simple and beautiful Kaleidoscope language. This digression now gives
51us a simple and ugly language in some ways, but also a powerful one at the same time.
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +000052One of the great things about creating your own language is that you get to
Chris Lattnerb7e6b1a2007-11-15 04:51:31 +000053decide what is good or bad. In this tutorial we'll assume that it is okay to
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +000054use this as a way to show some interesting parsing techniques.</p>
55
Chris Lattner3616a8a2007-11-07 06:06:38 +000056<p>At the end of this tutorial, we'll run through an example Kaleidoscope
57application that <a href="#example">renders the Mandelbrot set</a>. This gives
58an example of what you can build with Kaleidoscope and its feature set.</p>
Chris Lattner58f2c872007-11-02 05:42:52 +000059
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +000060</div>
61
62<!-- *********************************************************************** -->
Chris Lattner58f2c872007-11-02 05:42:52 +000063<div class="doc_section"><a name="idea">User-defined Operators: the Idea</a></div>
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +000064<!-- *********************************************************************** -->
65
66<div class="doc_text">
67
68<p>
Chris Lattner58f2c872007-11-02 05:42:52 +000069The "operator overloading" that we will add to Kaleidoscope is more general than
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +000070languages like C++. In C++, you are only allowed to redefine existing
71operators: you can't programatically change the grammar, introduce new
72operators, change precedence levels, etc. In this chapter, we will add this
Chris Lattner3616a8a2007-11-07 06:06:38 +000073capability to Kaleidoscope, which will let the user round out the set of
74operators that are supported.</p>
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +000075
Chris Lattner58f2c872007-11-02 05:42:52 +000076<p>The point of going into user-defined operators in a tutorial like this is to
Chris Lattnerb7e6b1a2007-11-15 04:51:31 +000077show the power and flexibility of using a hand-written parser. Thus far, the parser
78we have been implementing uses recursive descent for most parts of the grammar and
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +000079operator precedence parsing for the expressions. See <a
80href="LangImpl2.html">Chapter 2</a> for details. Without using operator
81precedence parsing, it would be very difficult to allow the programmer to
82introduce new operators into the grammar: the grammar is dynamically extensible
83as the JIT runs.</p>
84
85<p>The two specific features we'll add are programmable unary operators (right
86now, Kaleidoscope has no unary operators at all) as well as binary operators.
87An example of this is:</p>
88
89<div class="doc_code">
90<pre>
91# Logical unary not.
92def unary!(v)
93 if v then
94 0
95 else
96 1;
97
98# Define &gt; with the same precedence as &lt;.
99def binary&gt; 10 (LHS RHS)
Chris Lattner61ad4492007-11-23 22:19:33 +0000100 RHS &lt; LHS;
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +0000101
102# Binary "logical or", (note that it does not "short circuit")
103def binary| 5 (LHS RHS)
104 if LHS then
105 1
106 else if RHS then
107 1
108 else
109 0;
110
111# Define = with slightly lower precedence than relationals.
112def binary= 9 (LHS RHS)
113 !(LHS &lt; RHS | LHS &gt; RHS);
114</pre>
115</div>
116
117<p>Many languages aspire to being able to implement their standard runtime
118library in the language itself. In Kaleidoscope, we can implement significant
119parts of the language in the library!</p>
120
121<p>We will break down implementation of these features into two parts:
Chris Lattner58f2c872007-11-02 05:42:52 +0000122implementing support for user-defined binary operators and adding unary
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +0000123operators.</p>
124
125</div>
126
127<!-- *********************************************************************** -->
Chris Lattner58f2c872007-11-02 05:42:52 +0000128<div class="doc_section"><a name="binary">User-defined Binary Operators</a></div>
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +0000129<!-- *********************************************************************** -->
130
131<div class="doc_text">
132
Chris Lattner58f2c872007-11-02 05:42:52 +0000133<p>Adding support for user-defined binary operators is pretty simple with our
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +0000134current framework. We'll first add support for the unary/binary keywords:</p>
135
136<div class="doc_code">
137<pre>
138enum Token {
139 ...
140 <b>// operators
141 tok_binary = -11, tok_unary = -12</b>
142};
143...
144static int gettok() {
145...
146 if (IdentifierStr == "for") return tok_for;
147 if (IdentifierStr == "in") return tok_in;
148 <b>if (IdentifierStr == "binary") return tok_binary;
149 if (IdentifierStr == "unary") return tok_unary;</b>
150 return tok_identifier;
151</pre>
152</div>
153
154<p>This just adds lexer support for the unary and binary keywords, like we
155did in <a href="LangImpl5.html#iflexer">previous chapters</a>. One nice thing
Chris Lattnerb7e6b1a2007-11-15 04:51:31 +0000156about our current AST, is that we represent binary operators with full generalisation
157by using their ASCII code as the opcode. For our extended operators, we'll use this
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +0000158same representation, so we don't need any new AST or parser support.</p>
159
160<p>On the other hand, we have to be able to represent the definitions of these
Chris Lattnerb7e6b1a2007-11-15 04:51:31 +0000161new operators, in the "def binary| 5" part of the function definition. In our
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +0000162grammar so far, the "name" for the function definition is parsed as the
163"prototype" production and into the <tt>PrototypeAST</tt> AST node. To
164represent our new user-defined operators as prototypes, we have to extend
165the <tt>PrototypeAST</tt> AST node like this:</p>
166
167<div class="doc_code">
168<pre>
169/// PrototypeAST - This class represents the "prototype" for a function,
170/// which captures its argument names as well as if it is an operator.
171class PrototypeAST {
172 std::string Name;
173 std::vector&lt;std::string&gt; Args;
174 <b>bool isOperator;
175 unsigned Precedence; // Precedence if a binary op.</b>
176public:
177 PrototypeAST(const std::string &amp;name, const std::vector&lt;std::string&gt; &amp;args,
178 <b>bool isoperator = false, unsigned prec = 0</b>)
179 : Name(name), Args(args), <b>isOperator(isoperator), Precedence(prec)</b> {}
180
181 <b>bool isUnaryOp() const { return isOperator &amp;&amp; Args.size() == 1; }
182 bool isBinaryOp() const { return isOperator &amp;&amp; Args.size() == 2; }
183
184 char getOperatorName() const {
185 assert(isUnaryOp() || isBinaryOp());
186 return Name[Name.size()-1];
187 }
188
189 unsigned getBinaryPrecedence() const { return Precedence; }</b>
190
191 Function *Codegen();
192};
193</pre>
194</div>
195
196<p>Basically, in addition to knowing a name for the prototype, we now keep track
197of whether it was an operator, and if it was, what precedence level the operator
Chris Lattner58f2c872007-11-02 05:42:52 +0000198is at. The precedence is only used for binary operators (as you'll see below,
199it just doesn't apply for unary operators). Now that we have a way to represent
200the prototype for a user-defined operator, we need to parse it:</p>
201
202<div class="doc_code">
203<pre>
204/// prototype
205/// ::= id '(' id* ')'
206<b>/// ::= binary LETTER number? (id, id)</b>
207static PrototypeAST *ParsePrototype() {
208 std::string FnName;
209
210 <b>int Kind = 0; // 0 = identifier, 1 = unary, 2 = binary.
211 unsigned BinaryPrecedence = 30;</b>
212
213 switch (CurTok) {
214 default:
215 return ErrorP("Expected function name in prototype");
216 case tok_identifier:
217 FnName = IdentifierStr;
218 Kind = 0;
219 getNextToken();
220 break;
221 <b>case tok_binary:
222 getNextToken();
223 if (!isascii(CurTok))
224 return ErrorP("Expected binary operator");
225 FnName = "binary";
226 FnName += (char)CurTok;
227 Kind = 2;
228 getNextToken();
229
230 // Read the precedence if present.
231 if (CurTok == tok_number) {
232 if (NumVal &lt; 1 || NumVal &gt; 100)
233 return ErrorP("Invalid precedecnce: must be 1..100");
234 BinaryPrecedence = (unsigned)NumVal;
235 getNextToken();
236 }
237 break;</b>
238 }
239
240 if (CurTok != '(')
241 return ErrorP("Expected '(' in prototype");
242
243 std::vector&lt;std::string&gt; ArgNames;
244 while (getNextToken() == tok_identifier)
245 ArgNames.push_back(IdentifierStr);
246 if (CurTok != ')')
247 return ErrorP("Expected ')' in prototype");
248
249 // success.
250 getNextToken(); // eat ')'.
251
252 <b>// Verify right number of names for operator.
253 if (Kind &amp;&amp; ArgNames.size() != Kind)
254 return ErrorP("Invalid number of operands for operator");
255
256 return new PrototypeAST(FnName, ArgNames, Kind != 0, BinaryPrecedence);</b>
257}
258</pre>
259</div>
260
Chris Lattnerb7e6b1a2007-11-15 04:51:31 +0000261<p>This is all fairly straightforward parsing code, and we have already seen
262a lot of similar code in the past. One interesting part about the code above is
263the couple lines that set up <tt>FnName</tt> for binary operators. This builds names
264like "binary@" for a newly defined "@" operator. This then takes advantage of the
265fact that symbol names in the LLVM symbol table are allowed to have any character in
266them, including embedded nul characters.</p>
Chris Lattner58f2c872007-11-02 05:42:52 +0000267
Chris Lattnerb7e6b1a2007-11-15 04:51:31 +0000268<p>The next interesting thing to add, is codegen support for these binary operators.
Chris Lattner58f2c872007-11-02 05:42:52 +0000269Given our current structure, this is a simple addition of a default case for our
270existing binary operator node:</p>
271
272<div class="doc_code">
273<pre>
274Value *BinaryExprAST::Codegen() {
275 Value *L = LHS-&gt;Codegen();
276 Value *R = RHS-&gt;Codegen();
277 if (L == 0 || R == 0) return 0;
278
279 switch (Op) {
280 case '+': return Builder.CreateAdd(L, R, "addtmp");
281 case '-': return Builder.CreateSub(L, R, "subtmp");
282 case '*': return Builder.CreateMul(L, R, "multmp");
283 case '&lt;':
Chris Lattner71155212007-11-06 01:39:12 +0000284 L = Builder.CreateFCmpULT(L, R, "cmptmp");
Chris Lattner58f2c872007-11-02 05:42:52 +0000285 // Convert bool 0/1 to double 0.0 or 1.0
286 return Builder.CreateUIToFP(L, Type::DoubleTy, "booltmp");
287 <b>default: break;</b>
288 }
289
290 <b>// If it wasn't a builtin binary operator, it must be a user defined one. Emit
291 // a call to it.
292 Function *F = TheModule-&gt;getFunction(std::string("binary")+Op);
293 assert(F &amp;&amp; "binary operator not found!");
294
295 Value *Ops[] = { L, R };
296 return Builder.CreateCall(F, Ops, Ops+2, "binop");</b>
297}
298
299</pre>
300</div>
301
302<p>As you can see above, the new code is actually really simple. It just does
303a lookup for the appropriate operator in the symbol table and generates a
304function call to it. Since user-defined operators are just built as normal
Chris Lattnerb7e6b1a2007-11-15 04:51:31 +0000305functions (because the "prototype" boils down to a function with the right
Chris Lattner58f2c872007-11-02 05:42:52 +0000306name) everything falls into place.</p>
307
Chris Lattnerb7e6b1a2007-11-15 04:51:31 +0000308<p>The final piece of code we are missing, is a bit of top level magic:</p>
Chris Lattner58f2c872007-11-02 05:42:52 +0000309
310<div class="doc_code">
311<pre>
312Function *FunctionAST::Codegen() {
313 NamedValues.clear();
314
315 Function *TheFunction = Proto->Codegen();
316 if (TheFunction == 0)
317 return 0;
318
319 <b>// If this is an operator, install it.
320 if (Proto-&gt;isBinaryOp())
321 BinopPrecedence[Proto->getOperatorName()] = Proto->getBinaryPrecedence();</b>
322
323 // Create a new basic block to start insertion into.
324 BasicBlock *BB = new BasicBlock("entry", TheFunction);
325 Builder.SetInsertPoint(BB);
326
327 if (Value *RetVal = Body-&gt;Codegen()) {
328 ...
329</pre>
330</div>
331
332<p>Basically, before codegening a function, if it is a user-defined operator, we
333register it in the precedence table. This allows the binary operator parsing
Chris Lattnerb7e6b1a2007-11-15 04:51:31 +0000334logic we already have in place to handle it. Since we are working on a fully-general operator precedence parser, this is all we need to do to "extend the grammar".</p>
Chris Lattner58f2c872007-11-02 05:42:52 +0000335
Chris Lattnerb7e6b1a2007-11-15 04:51:31 +0000336<p>Now we have useful user-defined binary operators. This builds a lot
Chris Lattner58f2c872007-11-02 05:42:52 +0000337on the previous framework we built for other operators. Adding unary operators
Chris Lattner3616a8a2007-11-07 06:06:38 +0000338is a bit more challenging, because we don't have any framework for it yet - lets
Chris Lattner58f2c872007-11-02 05:42:52 +0000339see what it takes.</p>
340
341</div>
342
343<!-- *********************************************************************** -->
344<div class="doc_section"><a name="unary">User-defined Unary Operators</a></div>
345<!-- *********************************************************************** -->
346
347<div class="doc_text">
348
349<p>Since we don't currently support unary operators in the Kaleidoscope
Chris Lattnerb7e6b1a2007-11-15 04:51:31 +0000350language, we'll need to add everything to support them. Above, we added simple
Chris Lattner58f2c872007-11-02 05:42:52 +0000351support for the 'unary' keyword to the lexer. In addition to that, we need an
352AST node:</p>
353
354<div class="doc_code">
355<pre>
356/// UnaryExprAST - Expression class for a unary operator.
357class UnaryExprAST : public ExprAST {
358 char Opcode;
359 ExprAST *Operand;
360public:
361 UnaryExprAST(char opcode, ExprAST *operand)
362 : Opcode(opcode), Operand(operand) {}
363 virtual Value *Codegen();
364};
365</pre>
366</div>
367
368<p>This AST node is very simple and obvious by now. It directly mirrors the
369binary operator AST node, except that it only has one child. With this, we
370need to add the parsing logic. Parsing a unary operator is pretty simple: we'll
371add a new function to do it:</p>
372
373<div class="doc_code">
374<pre>
375/// unary
376/// ::= primary
377/// ::= '!' unary
378static ExprAST *ParseUnary() {
379 // If the current token is not an operator, it must be a primary expr.
380 if (!isascii(CurTok) || CurTok == '(' || CurTok == ',')
381 return ParsePrimary();
382
383 // If this is a unary operator, read it.
384 int Opc = CurTok;
385 getNextToken();
386 if (ExprAST *Operand = ParseUnary())
387 return new UnaryExprAST(Opc, Operand);
388 return 0;
389}
390</pre>
391</div>
392
Chris Lattnerb7e6b1a2007-11-15 04:51:31 +0000393<p>The grammar we add is pretty straightforward here. If we see a unary
Chris Lattner58f2c872007-11-02 05:42:52 +0000394operator when parsing a primary operator, we eat the operator as a prefix and
395parse the remaining piece as another unary operator. This allows us to handle
396multiple unary operators (e.g. "!!x"). Note that unary operators can't have
397ambiguous parses like binary operators can, so there is no need for precedence
398information.</p>
399
Chris Lattnerb7e6b1a2007-11-15 04:51:31 +0000400<p>The problem with this function, is that we need to call ParseUnary from somewhere.
Chris Lattner58f2c872007-11-02 05:42:52 +0000401To do this, we change previous callers of ParsePrimary to call ParseUnary
402instead:</p>
403
404<div class="doc_code">
405<pre>
406/// binoprhs
407/// ::= ('+' unary)*
408static ExprAST *ParseBinOpRHS(int ExprPrec, ExprAST *LHS) {
409 ...
410 <b>// Parse the unary expression after the binary operator.
411 ExprAST *RHS = ParseUnary();
412 if (!RHS) return 0;</b>
413 ...
414}
415/// expression
416/// ::= unary binoprhs
417///
418static ExprAST *ParseExpression() {
419 <b>ExprAST *LHS = ParseUnary();</b>
420 if (!LHS) return 0;
421
422 return ParseBinOpRHS(0, LHS);
423}
424</pre>
425</div>
426
Chris Lattnerb7e6b1a2007-11-15 04:51:31 +0000427<p>With these two simple changes, we are now able to parse unary operators and build the
Chris Lattner58f2c872007-11-02 05:42:52 +0000428AST for them. Next up, we need to add parser support for prototypes, to parse
429the unary operator prototype. We extend the binary operator code above
430with:</p>
431
432<div class="doc_code">
433<pre>
434/// prototype
435/// ::= id '(' id* ')'
436/// ::= binary LETTER number? (id, id)
437<b>/// ::= unary LETTER (id)</b>
438static PrototypeAST *ParsePrototype() {
439 std::string FnName;
440
441 int Kind = 0; // 0 = identifier, 1 = unary, 2 = binary.
442 unsigned BinaryPrecedence = 30;
443
444 switch (CurTok) {
445 default:
446 return ErrorP("Expected function name in prototype");
447 case tok_identifier:
448 FnName = IdentifierStr;
449 Kind = 0;
450 getNextToken();
451 break;
452 <b>case tok_unary:
453 getNextToken();
454 if (!isascii(CurTok))
455 return ErrorP("Expected unary operator");
456 FnName = "unary";
457 FnName += (char)CurTok;
458 Kind = 1;
459 getNextToken();
460 break;</b>
461 case tok_binary:
462 ...
463</pre>
464</div>
465
466<p>As with binary operators, we name unary operators with a name that includes
467the operator character. This assists us at code generation time. Speaking of,
468the final piece we need to add is codegen support for unary operators. It looks
469like this:</p>
470
471<div class="doc_code">
472<pre>
473Value *UnaryExprAST::Codegen() {
474 Value *OperandV = Operand->Codegen();
475 if (OperandV == 0) return 0;
476
477 Function *F = TheModule->getFunction(std::string("unary")+Opcode);
478 if (F == 0)
479 return ErrorV("Unknown unary operator");
480
481 return Builder.CreateCall(F, OperandV, "unop");
482}
483</pre>
484</div>
485
486<p>This code is similar to, but simpler than, the code for binary operators. It
487is simpler primarily because it doesn't need to handle any predefined operators.
488</p>
489
490</div>
491
492<!-- *********************************************************************** -->
493<div class="doc_section"><a name="example">Kicking the Tires</a></div>
494<!-- *********************************************************************** -->
495
496<div class="doc_text">
497
498<p>It is somewhat hard to believe, but with a few simple extensions we've
Chris Lattnerb5d81b32007-11-02 05:54:25 +0000499covered in the last chapters, we have grown a real-ish language. With this, we
Chris Lattner58f2c872007-11-02 05:42:52 +0000500can do a lot of interesting things, including I/O, math, and a bunch of other
501things. For example, we can now add a nice sequencing operator (printd is
502defined to print out the specified value and a newline):</p>
503
504<div class="doc_code">
505<pre>
506ready&gt; <b>extern printd(x);</b>
507Read extern: declare double @printd(double)
508ready&gt; <b>def binary : 1 (x y) 0; # Low-precedence operator that ignores operands.</b>
509..
510ready&gt; <b>printd(123) : printd(456) : printd(789);</b>
511123.000000
512456.000000
513789.000000
514Evaluated to 0.000000
515</pre>
516</div>
517
Chris Lattnerb5d81b32007-11-02 05:54:25 +0000518<p>We can also define a bunch of other "primitive" operations, such as:</p>
Chris Lattner58f2c872007-11-02 05:42:52 +0000519
520<div class="doc_code">
521<pre>
522# Logical unary not.
523def unary!(v)
524 if v then
525 0
526 else
527 1;
528
529# Unary negate.
530def unary-(v)
531 0-v;
532
Chris Lattner3616a8a2007-11-07 06:06:38 +0000533# Define &gt; with the same precedence as &gt;.
Chris Lattner58f2c872007-11-02 05:42:52 +0000534def binary&gt; 10 (LHS RHS)
Chris Lattner61ad4492007-11-23 22:19:33 +0000535 RHS &lt; LHS;
Chris Lattner58f2c872007-11-02 05:42:52 +0000536
537# Binary logical or, which does not short circuit.
538def binary| 5 (LHS RHS)
539 if LHS then
540 1
541 else if RHS then
542 1
543 else
544 0;
545
546# Binary logical and, which does not short circuit.
547def binary&amp; 6 (LHS RHS)
548 if !LHS then
549 0
550 else
551 !!RHS;
552
553# Define = with slightly lower precedence than relationals.
554def binary = 9 (LHS RHS)
555 !(LHS &lt; RHS | LHS &gt; RHS);
556
557</pre>
558</div>
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +0000559
560
Chris Lattner58f2c872007-11-02 05:42:52 +0000561<p>Given the previous if/then/else support, we can also define interesting
562functions for I/O. For example, the following prints out a character whose
563"density" reflects the value passed in: the lower the value, the denser the
564character:</p>
565
566<div class="doc_code">
567<pre>
568ready&gt;
569<b>
570extern putchard(char)
571def printdensity(d)
572 if d &gt; 8 then
573 putchard(32) # ' '
574 else if d &gt; 4 then
575 putchard(46) # '.'
576 else if d &gt; 2 then
577 putchard(43) # '+'
578 else
579 putchard(42); # '*'</b>
580...
581ready&gt; <b>printdensity(1): printdensity(2): printdensity(3) :
582 printdensity(4): printdensity(5): printdensity(9): putchard(10);</b>
583*++..
584Evaluated to 0.000000
585</pre>
586</div>
587
588<p>Based on these simple primitive operations, we can start to define more
589interesting things. For example, here's a little function that solves for the
Chris Lattnerb7e6b1a2007-11-15 04:51:31 +0000590number of iterations it takes a function in the complex plane to
Chris Lattner58f2c872007-11-02 05:42:52 +0000591converge:</p>
592
593<div class="doc_code">
594<pre>
595# determine whether the specific location diverges.
596# Solve for z = z^2 + c in the complex plane.
597def mandleconverger(real imag iters creal cimag)
598 if iters &gt; 255 | (real*real + imag*imag &gt; 4) then
599 iters
600 else
601 mandleconverger(real*real - imag*imag + creal,
602 2*real*imag + cimag,
603 iters+1, creal, cimag);
604
605# return the number of iterations required for the iteration to escape
606def mandleconverge(real imag)
607 mandleconverger(real, imag, 0, real, imag);
608</pre>
609</div>
610
Chris Lattnerb5d81b32007-11-02 05:54:25 +0000611<p>This "z = z<sup>2</sup> + c" function is a beautiful little creature that is the basis
Chris Lattner58f2c872007-11-02 05:42:52 +0000612for computation of the <a
613href="http://en.wikipedia.org/wiki/Mandelbrot_set">Mandelbrot Set</a>. Our
614<tt>mandelconverge</tt> function returns the number of iterations that it takes
615for a complex orbit to escape, saturating to 255. This is not a very useful
616function by itself, but if you plot its value over a two-dimensional plane,
Chris Lattner3616a8a2007-11-07 06:06:38 +0000617you can see the Mandelbrot set. Given that we are limited to using putchard
Chris Lattner58f2c872007-11-02 05:42:52 +0000618here, our amazing graphical output is limited, but we can whip together
619something using the density plotter above:</p>
620
621<div class="doc_code">
622<pre>
Chris Lattner3616a8a2007-11-07 06:06:38 +0000623# compute and plot the mandlebrot set with the specified 2 dimensional range
Chris Lattner58f2c872007-11-02 05:42:52 +0000624# info.
625def mandelhelp(xmin xmax xstep ymin ymax ystep)
626 for y = ymin, y &lt; ymax, ystep in (
627 (for x = xmin, x &lt; xmax, xstep in
628 printdensity(mandleconverge(x,y)))
629 : putchard(10)
630 )
631
632# mandel - This is a convenient helper function for ploting the mandelbrot set
Chris Lattner3616a8a2007-11-07 06:06:38 +0000633# from the specified position with the specified Magnification.
Chris Lattner58f2c872007-11-02 05:42:52 +0000634def mandel(realstart imagstart realmag imagmag)
635 mandelhelp(realstart, realstart+realmag*78, realmag,
636 imagstart, imagstart+imagmag*40, imagmag);
637</pre>
638</div>
639
640<p>Given this, we can try plotting out the mandlebrot set! Lets try it out:</p>
641
642<div class="doc_code">
643<pre>
644ready&gt; <b>mandel(-2.3, -1.3, 0.05, 0.07);</b>
645*******************************+++++++++++*************************************
646*************************+++++++++++++++++++++++*******************************
647**********************+++++++++++++++++++++++++++++****************************
648*******************+++++++++++++++++++++.. ...++++++++*************************
649*****************++++++++++++++++++++++.... ...+++++++++***********************
650***************+++++++++++++++++++++++..... ...+++++++++*********************
651**************+++++++++++++++++++++++.... ....+++++++++********************
652*************++++++++++++++++++++++...... .....++++++++*******************
653************+++++++++++++++++++++....... .......+++++++******************
654***********+++++++++++++++++++.... ... .+++++++*****************
655**********+++++++++++++++++....... .+++++++****************
656*********++++++++++++++........... ...+++++++***************
657********++++++++++++............ ...++++++++**************
658********++++++++++... .......... .++++++++**************
659*******+++++++++..... .+++++++++*************
660*******++++++++...... ..+++++++++*************
661*******++++++....... ..+++++++++*************
662*******+++++...... ..+++++++++*************
663*******.... .... ...+++++++++*************
664*******.... . ...+++++++++*************
665*******+++++...... ...+++++++++*************
666*******++++++....... ..+++++++++*************
667*******++++++++...... .+++++++++*************
668*******+++++++++..... ..+++++++++*************
669********++++++++++... .......... .++++++++**************
670********++++++++++++............ ...++++++++**************
671*********++++++++++++++.......... ...+++++++***************
672**********++++++++++++++++........ .+++++++****************
673**********++++++++++++++++++++.... ... ..+++++++****************
674***********++++++++++++++++++++++....... .......++++++++*****************
675************+++++++++++++++++++++++...... ......++++++++******************
676**************+++++++++++++++++++++++.... ....++++++++********************
677***************+++++++++++++++++++++++..... ...+++++++++*********************
678*****************++++++++++++++++++++++.... ...++++++++***********************
679*******************+++++++++++++++++++++......++++++++*************************
680*********************++++++++++++++++++++++.++++++++***************************
681*************************+++++++++++++++++++++++*******************************
682******************************+++++++++++++************************************
683*******************************************************************************
684*******************************************************************************
685*******************************************************************************
686Evaluated to 0.000000
687ready&gt; <b>mandel(-2, -1, 0.02, 0.04);</b>
688**************************+++++++++++++++++++++++++++++++++++++++++++++++++++++
689***********************++++++++++++++++++++++++++++++++++++++++++++++++++++++++
690*********************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++.
691*******************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++...
692*****************+++++++++++++++++++++++++++++++++++++++++++++++++++++++++.....
693***************++++++++++++++++++++++++++++++++++++++++++++++++++++++++........
694**************++++++++++++++++++++++++++++++++++++++++++++++++++++++...........
695************+++++++++++++++++++++++++++++++++++++++++++++++++++++..............
696***********++++++++++++++++++++++++++++++++++++++++++++++++++........ .
697**********++++++++++++++++++++++++++++++++++++++++++++++.............
698********+++++++++++++++++++++++++++++++++++++++++++..................
699*******+++++++++++++++++++++++++++++++++++++++.......................
700******+++++++++++++++++++++++++++++++++++...........................
701*****++++++++++++++++++++++++++++++++............................
702*****++++++++++++++++++++++++++++...............................
703****++++++++++++++++++++++++++...... .........................
704***++++++++++++++++++++++++......... ...... ...........
705***++++++++++++++++++++++............
706**+++++++++++++++++++++..............
707**+++++++++++++++++++................
708*++++++++++++++++++.................
709*++++++++++++++++............ ...
710*++++++++++++++..............
711*+++....++++................
712*.......... ...........
713*
714*.......... ...........
715*+++....++++................
716*++++++++++++++..............
717*++++++++++++++++............ ...
718*++++++++++++++++++.................
719**+++++++++++++++++++................
720**+++++++++++++++++++++..............
721***++++++++++++++++++++++............
722***++++++++++++++++++++++++......... ...... ...........
723****++++++++++++++++++++++++++...... .........................
724*****++++++++++++++++++++++++++++...............................
725*****++++++++++++++++++++++++++++++++............................
726******+++++++++++++++++++++++++++++++++++...........................
727*******+++++++++++++++++++++++++++++++++++++++.......................
728********+++++++++++++++++++++++++++++++++++++++++++..................
729Evaluated to 0.000000
730ready&gt; <b>mandel(-0.9, -1.4, 0.02, 0.03);</b>
731*******************************************************************************
732*******************************************************************************
733*******************************************************************************
734**********+++++++++++++++++++++************************************************
735*+++++++++++++++++++++++++++++++++++++++***************************************
736+++++++++++++++++++++++++++++++++++++++++++++**********************************
737++++++++++++++++++++++++++++++++++++++++++++++++++*****************************
738++++++++++++++++++++++++++++++++++++++++++++++++++++++*************************
739+++++++++++++++++++++++++++++++++++++++++++++++++++++++++**********************
740+++++++++++++++++++++++++++++++++.........++++++++++++++++++*******************
741+++++++++++++++++++++++++++++++.... ......+++++++++++++++++++****************
742+++++++++++++++++++++++++++++....... ........+++++++++++++++++++**************
743++++++++++++++++++++++++++++........ ........++++++++++++++++++++************
744+++++++++++++++++++++++++++......... .. ...+++++++++++++++++++++**********
745++++++++++++++++++++++++++........... ....++++++++++++++++++++++********
746++++++++++++++++++++++++............. .......++++++++++++++++++++++******
747+++++++++++++++++++++++............. ........+++++++++++++++++++++++****
748++++++++++++++++++++++........... ..........++++++++++++++++++++++***
749++++++++++++++++++++........... .........++++++++++++++++++++++*
750++++++++++++++++++............ ...........++++++++++++++++++++
751++++++++++++++++............... .............++++++++++++++++++
752++++++++++++++................. ...............++++++++++++++++
753++++++++++++.................. .................++++++++++++++
754+++++++++.................. .................+++++++++++++
755++++++........ . ......... ..++++++++++++
756++............ ...... ....++++++++++
757.............. ...++++++++++
758.............. ....+++++++++
759.............. .....++++++++
760............. ......++++++++
761........... .......++++++++
762......... ........+++++++
763......... ........+++++++
764......... ....+++++++
765........ ...+++++++
766....... ...+++++++
767 ....+++++++
768 .....+++++++
769 ....+++++++
770 ....+++++++
771 ....+++++++
772Evaluated to 0.000000
773ready&gt; <b>^D</b>
774</pre>
775</div>
776
777<p>At this point, you may be starting to realize that Kaleidoscope is a real
778and powerful language. It may not be self-similar :), but it can be used to
779plot things that are!</p>
780
781<p>With this, we conclude the "adding user-defined operators" chapter of the
Chris Lattnerb7e6b1a2007-11-15 04:51:31 +0000782tutorial. We have successfully augmented our language, adding the ability to extend the
783language in the library, and we have shown how this can be used to build a simple but
Chris Lattner3616a8a2007-11-07 06:06:38 +0000784interesting end-user application in Kaleidoscope. At this point, Kaleidoscope
Chris Lattner58f2c872007-11-02 05:42:52 +0000785can build a variety of applications that are functional and can call functions
Chris Lattnerb5d81b32007-11-02 05:54:25 +0000786with side-effects, but it can't actually define and mutate a variable itself.
Chris Lattner58f2c872007-11-02 05:42:52 +0000787</p>
788
Chris Lattner3616a8a2007-11-07 06:06:38 +0000789<p>Strikingly, variable mutation is an important feature of some
Chris Lattner58f2c872007-11-02 05:42:52 +0000790languages, and it is not at all obvious how to <a href="LangImpl7.html">add
791support for mutable variables</a> without having to add an "SSA construction"
792phase to your front-end. In the next chapter, we will describe how you can
Chris Lattnerb7e6b1a2007-11-15 04:51:31 +0000793add variable mutation without building SSA in your front-end.</p>
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +0000794
795</div>
796
797
798<!-- *********************************************************************** -->
799<div class="doc_section"><a name="code">Full Code Listing</a></div>
800<!-- *********************************************************************** -->
801
802<div class="doc_text">
803
804<p>
805Here is the complete code listing for our running example, enhanced with the
806if/then/else and for expressions.. To build this example, use:
807</p>
808
809<div class="doc_code">
810<pre>
811 # Compile
812 g++ -g toy.cpp `llvm-config --cppflags --ldflags --libs core jit native` -O3 -o toy
813 # Run
814 ./toy
815</pre>
816</div>
817
818<p>Here is the code:</p>
819
820<div class="doc_code">
821<pre>
Chris Lattner58f2c872007-11-02 05:42:52 +0000822#include "llvm/DerivedTypes.h"
823#include "llvm/ExecutionEngine/ExecutionEngine.h"
824#include "llvm/Module.h"
825#include "llvm/ModuleProvider.h"
826#include "llvm/PassManager.h"
827#include "llvm/Analysis/Verifier.h"
828#include "llvm/Target/TargetData.h"
829#include "llvm/Transforms/Scalar.h"
830#include "llvm/Support/LLVMBuilder.h"
831#include &lt;cstdio&gt;
832#include &lt;string&gt;
833#include &lt;map&gt;
834#include &lt;vector&gt;
835using namespace llvm;
836
837//===----------------------------------------------------------------------===//
838// Lexer
839//===----------------------------------------------------------------------===//
840
841// The lexer returns tokens [0-255] if it is an unknown character, otherwise one
842// of these for known things.
843enum Token {
844 tok_eof = -1,
845
846 // commands
847 tok_def = -2, tok_extern = -3,
848
849 // primary
850 tok_identifier = -4, tok_number = -5,
851
852 // control
853 tok_if = -6, tok_then = -7, tok_else = -8,
854 tok_for = -9, tok_in = -10,
855
856 // operators
857 tok_binary = -11, tok_unary = -12
858};
859
860static std::string IdentifierStr; // Filled in if tok_identifier
861static double NumVal; // Filled in if tok_number
862
863/// gettok - Return the next token from standard input.
864static int gettok() {
865 static int LastChar = ' ';
866
867 // Skip any whitespace.
868 while (isspace(LastChar))
869 LastChar = getchar();
870
871 if (isalpha(LastChar)) { // identifier: [a-zA-Z][a-zA-Z0-9]*
872 IdentifierStr = LastChar;
873 while (isalnum((LastChar = getchar())))
874 IdentifierStr += LastChar;
875
876 if (IdentifierStr == "def") return tok_def;
877 if (IdentifierStr == "extern") return tok_extern;
878 if (IdentifierStr == "if") return tok_if;
879 if (IdentifierStr == "then") return tok_then;
880 if (IdentifierStr == "else") return tok_else;
881 if (IdentifierStr == "for") return tok_for;
882 if (IdentifierStr == "in") return tok_in;
883 if (IdentifierStr == "binary") return tok_binary;
884 if (IdentifierStr == "unary") return tok_unary;
885 return tok_identifier;
886 }
887
888 if (isdigit(LastChar) || LastChar == '.') { // Number: [0-9.]+
889 std::string NumStr;
890 do {
891 NumStr += LastChar;
892 LastChar = getchar();
893 } while (isdigit(LastChar) || LastChar == '.');
894
895 NumVal = strtod(NumStr.c_str(), 0);
896 return tok_number;
897 }
898
899 if (LastChar == '#') {
900 // Comment until end of line.
901 do LastChar = getchar();
902 while (LastChar != EOF &amp;&amp; LastChar != '\n' &amp; LastChar != '\r');
903
904 if (LastChar != EOF)
905 return gettok();
906 }
907
908 // Check for end of file. Don't eat the EOF.
909 if (LastChar == EOF)
910 return tok_eof;
911
912 // Otherwise, just return the character as its ascii value.
913 int ThisChar = LastChar;
914 LastChar = getchar();
915 return ThisChar;
916}
917
918//===----------------------------------------------------------------------===//
919// Abstract Syntax Tree (aka Parse Tree)
920//===----------------------------------------------------------------------===//
921
922/// ExprAST - Base class for all expression nodes.
923class ExprAST {
924public:
925 virtual ~ExprAST() {}
926 virtual Value *Codegen() = 0;
927};
928
929/// NumberExprAST - Expression class for numeric literals like "1.0".
930class NumberExprAST : public ExprAST {
931 double Val;
932public:
933 NumberExprAST(double val) : Val(val) {}
934 virtual Value *Codegen();
935};
936
937/// VariableExprAST - Expression class for referencing a variable, like "a".
938class VariableExprAST : public ExprAST {
939 std::string Name;
940public:
941 VariableExprAST(const std::string &amp;name) : Name(name) {}
942 virtual Value *Codegen();
943};
944
945/// UnaryExprAST - Expression class for a unary operator.
946class UnaryExprAST : public ExprAST {
947 char Opcode;
948 ExprAST *Operand;
949public:
950 UnaryExprAST(char opcode, ExprAST *operand)
951 : Opcode(opcode), Operand(operand) {}
952 virtual Value *Codegen();
953};
954
955/// BinaryExprAST - Expression class for a binary operator.
956class BinaryExprAST : public ExprAST {
957 char Op;
958 ExprAST *LHS, *RHS;
959public:
960 BinaryExprAST(char op, ExprAST *lhs, ExprAST *rhs)
961 : Op(op), LHS(lhs), RHS(rhs) {}
962 virtual Value *Codegen();
963};
964
965/// CallExprAST - Expression class for function calls.
966class CallExprAST : public ExprAST {
967 std::string Callee;
968 std::vector&lt;ExprAST*&gt; Args;
969public:
970 CallExprAST(const std::string &amp;callee, std::vector&lt;ExprAST*&gt; &amp;args)
971 : Callee(callee), Args(args) {}
972 virtual Value *Codegen();
973};
974
975/// IfExprAST - Expression class for if/then/else.
976class IfExprAST : public ExprAST {
977 ExprAST *Cond, *Then, *Else;
978public:
979 IfExprAST(ExprAST *cond, ExprAST *then, ExprAST *_else)
980 : Cond(cond), Then(then), Else(_else) {}
981 virtual Value *Codegen();
982};
983
984/// ForExprAST - Expression class for for/in.
985class ForExprAST : public ExprAST {
986 std::string VarName;
987 ExprAST *Start, *End, *Step, *Body;
988public:
989 ForExprAST(const std::string &amp;varname, ExprAST *start, ExprAST *end,
990 ExprAST *step, ExprAST *body)
991 : VarName(varname), Start(start), End(end), Step(step), Body(body) {}
992 virtual Value *Codegen();
993};
994
995/// PrototypeAST - This class represents the "prototype" for a function,
996/// which captures its argument names as well as if it is an operator.
997class PrototypeAST {
998 std::string Name;
999 std::vector&lt;std::string&gt; Args;
1000 bool isOperator;
1001 unsigned Precedence; // Precedence if a binary op.
1002public:
1003 PrototypeAST(const std::string &amp;name, const std::vector&lt;std::string&gt; &amp;args,
1004 bool isoperator = false, unsigned prec = 0)
1005 : Name(name), Args(args), isOperator(isoperator), Precedence(prec) {}
1006
1007 bool isUnaryOp() const { return isOperator &amp;&amp; Args.size() == 1; }
1008 bool isBinaryOp() const { return isOperator &amp;&amp; Args.size() == 2; }
1009
1010 char getOperatorName() const {
1011 assert(isUnaryOp() || isBinaryOp());
1012 return Name[Name.size()-1];
1013 }
1014
1015 unsigned getBinaryPrecedence() const { return Precedence; }
1016
1017 Function *Codegen();
1018};
1019
1020/// FunctionAST - This class represents a function definition itself.
1021class FunctionAST {
1022 PrototypeAST *Proto;
1023 ExprAST *Body;
1024public:
1025 FunctionAST(PrototypeAST *proto, ExprAST *body)
1026 : Proto(proto), Body(body) {}
1027
1028 Function *Codegen();
1029};
1030
1031//===----------------------------------------------------------------------===//
1032// Parser
1033//===----------------------------------------------------------------------===//
1034
1035/// CurTok/getNextToken - Provide a simple token buffer. CurTok is the current
1036/// token the parser it looking at. getNextToken reads another token from the
1037/// lexer and updates CurTok with its results.
1038static int CurTok;
1039static int getNextToken() {
1040 return CurTok = gettok();
1041}
1042
1043/// BinopPrecedence - This holds the precedence for each binary operator that is
1044/// defined.
1045static std::map&lt;char, int&gt; BinopPrecedence;
1046
1047/// GetTokPrecedence - Get the precedence of the pending binary operator token.
1048static int GetTokPrecedence() {
1049 if (!isascii(CurTok))
1050 return -1;
1051
1052 // Make sure it's a declared binop.
1053 int TokPrec = BinopPrecedence[CurTok];
1054 if (TokPrec &lt;= 0) return -1;
1055 return TokPrec;
1056}
1057
1058/// Error* - These are little helper functions for error handling.
1059ExprAST *Error(const char *Str) { fprintf(stderr, "Error: %s\n", Str);return 0;}
1060PrototypeAST *ErrorP(const char *Str) { Error(Str); return 0; }
1061FunctionAST *ErrorF(const char *Str) { Error(Str); return 0; }
1062
1063static ExprAST *ParseExpression();
1064
1065/// identifierexpr
Chris Lattner20a0c802007-11-05 17:54:34 +00001066/// ::= identifier
1067/// ::= identifier '(' expression* ')'
Chris Lattner58f2c872007-11-02 05:42:52 +00001068static ExprAST *ParseIdentifierExpr() {
1069 std::string IdName = IdentifierStr;
1070
Chris Lattner20a0c802007-11-05 17:54:34 +00001071 getNextToken(); // eat identifier.
Chris Lattner58f2c872007-11-02 05:42:52 +00001072
1073 if (CurTok != '(') // Simple variable ref.
1074 return new VariableExprAST(IdName);
1075
1076 // Call.
1077 getNextToken(); // eat (
1078 std::vector&lt;ExprAST*&gt; Args;
1079 if (CurTok != ')') {
1080 while (1) {
1081 ExprAST *Arg = ParseExpression();
1082 if (!Arg) return 0;
1083 Args.push_back(Arg);
1084
1085 if (CurTok == ')') break;
1086
1087 if (CurTok != ',')
1088 return Error("Expected ')'");
1089 getNextToken();
1090 }
1091 }
1092
1093 // Eat the ')'.
1094 getNextToken();
1095
1096 return new CallExprAST(IdName, Args);
1097}
1098
1099/// numberexpr ::= number
1100static ExprAST *ParseNumberExpr() {
1101 ExprAST *Result = new NumberExprAST(NumVal);
1102 getNextToken(); // consume the number
1103 return Result;
1104}
1105
1106/// parenexpr ::= '(' expression ')'
1107static ExprAST *ParseParenExpr() {
1108 getNextToken(); // eat (.
1109 ExprAST *V = ParseExpression();
1110 if (!V) return 0;
1111
1112 if (CurTok != ')')
1113 return Error("expected ')'");
1114 getNextToken(); // eat ).
1115 return V;
1116}
1117
1118/// ifexpr ::= 'if' expression 'then' expression 'else' expression
1119static ExprAST *ParseIfExpr() {
1120 getNextToken(); // eat the if.
1121
1122 // condition.
1123 ExprAST *Cond = ParseExpression();
1124 if (!Cond) return 0;
1125
1126 if (CurTok != tok_then)
1127 return Error("expected then");
1128 getNextToken(); // eat the then
1129
1130 ExprAST *Then = ParseExpression();
1131 if (Then == 0) return 0;
1132
1133 if (CurTok != tok_else)
1134 return Error("expected else");
1135
1136 getNextToken();
1137
1138 ExprAST *Else = ParseExpression();
1139 if (!Else) return 0;
1140
1141 return new IfExprAST(Cond, Then, Else);
1142}
1143
Chris Lattner20a0c802007-11-05 17:54:34 +00001144/// forexpr ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression
Chris Lattner58f2c872007-11-02 05:42:52 +00001145static ExprAST *ParseForExpr() {
1146 getNextToken(); // eat the for.
1147
1148 if (CurTok != tok_identifier)
1149 return Error("expected identifier after for");
1150
1151 std::string IdName = IdentifierStr;
Chris Lattner20a0c802007-11-05 17:54:34 +00001152 getNextToken(); // eat identifier.
Chris Lattner58f2c872007-11-02 05:42:52 +00001153
1154 if (CurTok != '=')
1155 return Error("expected '=' after for");
1156 getNextToken(); // eat '='.
1157
1158
1159 ExprAST *Start = ParseExpression();
1160 if (Start == 0) return 0;
1161 if (CurTok != ',')
1162 return Error("expected ',' after for start value");
1163 getNextToken();
1164
1165 ExprAST *End = ParseExpression();
1166 if (End == 0) return 0;
1167
1168 // The step value is optional.
1169 ExprAST *Step = 0;
1170 if (CurTok == ',') {
1171 getNextToken();
1172 Step = ParseExpression();
1173 if (Step == 0) return 0;
1174 }
1175
1176 if (CurTok != tok_in)
1177 return Error("expected 'in' after for");
1178 getNextToken(); // eat 'in'.
1179
1180 ExprAST *Body = ParseExpression();
1181 if (Body == 0) return 0;
1182
1183 return new ForExprAST(IdName, Start, End, Step, Body);
1184}
1185
1186
1187/// primary
1188/// ::= identifierexpr
1189/// ::= numberexpr
1190/// ::= parenexpr
1191/// ::= ifexpr
1192/// ::= forexpr
1193static ExprAST *ParsePrimary() {
1194 switch (CurTok) {
1195 default: return Error("unknown token when expecting an expression");
1196 case tok_identifier: return ParseIdentifierExpr();
1197 case tok_number: return ParseNumberExpr();
1198 case '(': return ParseParenExpr();
1199 case tok_if: return ParseIfExpr();
1200 case tok_for: return ParseForExpr();
1201 }
1202}
1203
1204/// unary
1205/// ::= primary
1206/// ::= '!' unary
1207static ExprAST *ParseUnary() {
1208 // If the current token is not an operator, it must be a primary expr.
1209 if (!isascii(CurTok) || CurTok == '(' || CurTok == ',')
1210 return ParsePrimary();
1211
1212 // If this is a unary operator, read it.
1213 int Opc = CurTok;
1214 getNextToken();
1215 if (ExprAST *Operand = ParseUnary())
1216 return new UnaryExprAST(Opc, Operand);
1217 return 0;
1218}
1219
1220/// binoprhs
1221/// ::= ('+' unary)*
1222static ExprAST *ParseBinOpRHS(int ExprPrec, ExprAST *LHS) {
1223 // If this is a binop, find its precedence.
1224 while (1) {
1225 int TokPrec = GetTokPrecedence();
1226
1227 // If this is a binop that binds at least as tightly as the current binop,
1228 // consume it, otherwise we are done.
1229 if (TokPrec &lt; ExprPrec)
1230 return LHS;
1231
1232 // Okay, we know this is a binop.
1233 int BinOp = CurTok;
1234 getNextToken(); // eat binop
1235
1236 // Parse the unary expression after the binary operator.
1237 ExprAST *RHS = ParseUnary();
1238 if (!RHS) return 0;
1239
1240 // If BinOp binds less tightly with RHS than the operator after RHS, let
1241 // the pending operator take RHS as its LHS.
1242 int NextPrec = GetTokPrecedence();
1243 if (TokPrec &lt; NextPrec) {
1244 RHS = ParseBinOpRHS(TokPrec+1, RHS);
1245 if (RHS == 0) return 0;
1246 }
1247
1248 // Merge LHS/RHS.
1249 LHS = new BinaryExprAST(BinOp, LHS, RHS);
1250 }
1251}
1252
1253/// expression
1254/// ::= unary binoprhs
1255///
1256static ExprAST *ParseExpression() {
1257 ExprAST *LHS = ParseUnary();
1258 if (!LHS) return 0;
1259
1260 return ParseBinOpRHS(0, LHS);
1261}
1262
1263/// prototype
1264/// ::= id '(' id* ')'
1265/// ::= binary LETTER number? (id, id)
1266/// ::= unary LETTER (id)
1267static PrototypeAST *ParsePrototype() {
1268 std::string FnName;
1269
1270 int Kind = 0; // 0 = identifier, 1 = unary, 2 = binary.
1271 unsigned BinaryPrecedence = 30;
1272
1273 switch (CurTok) {
1274 default:
1275 return ErrorP("Expected function name in prototype");
1276 case tok_identifier:
1277 FnName = IdentifierStr;
1278 Kind = 0;
1279 getNextToken();
1280 break;
1281 case tok_unary:
1282 getNextToken();
1283 if (!isascii(CurTok))
1284 return ErrorP("Expected unary operator");
1285 FnName = "unary";
1286 FnName += (char)CurTok;
1287 Kind = 1;
1288 getNextToken();
1289 break;
1290 case tok_binary:
1291 getNextToken();
1292 if (!isascii(CurTok))
1293 return ErrorP("Expected binary operator");
1294 FnName = "binary";
1295 FnName += (char)CurTok;
1296 Kind = 2;
1297 getNextToken();
1298
1299 // Read the precedence if present.
1300 if (CurTok == tok_number) {
1301 if (NumVal &lt; 1 || NumVal &gt; 100)
1302 return ErrorP("Invalid precedecnce: must be 1..100");
1303 BinaryPrecedence = (unsigned)NumVal;
1304 getNextToken();
1305 }
1306 break;
1307 }
1308
1309 if (CurTok != '(')
1310 return ErrorP("Expected '(' in prototype");
1311
1312 std::vector&lt;std::string&gt; ArgNames;
1313 while (getNextToken() == tok_identifier)
1314 ArgNames.push_back(IdentifierStr);
1315 if (CurTok != ')')
1316 return ErrorP("Expected ')' in prototype");
1317
1318 // success.
1319 getNextToken(); // eat ')'.
1320
1321 // Verify right number of names for operator.
1322 if (Kind &amp;&amp; ArgNames.size() != Kind)
1323 return ErrorP("Invalid number of operands for operator");
1324
1325 return new PrototypeAST(FnName, ArgNames, Kind != 0, BinaryPrecedence);
1326}
1327
1328/// definition ::= 'def' prototype expression
1329static FunctionAST *ParseDefinition() {
1330 getNextToken(); // eat def.
1331 PrototypeAST *Proto = ParsePrototype();
1332 if (Proto == 0) return 0;
1333
1334 if (ExprAST *E = ParseExpression())
1335 return new FunctionAST(Proto, E);
1336 return 0;
1337}
1338
1339/// toplevelexpr ::= expression
1340static FunctionAST *ParseTopLevelExpr() {
1341 if (ExprAST *E = ParseExpression()) {
1342 // Make an anonymous proto.
1343 PrototypeAST *Proto = new PrototypeAST("", std::vector&lt;std::string&gt;());
1344 return new FunctionAST(Proto, E);
1345 }
1346 return 0;
1347}
1348
1349/// external ::= 'extern' prototype
1350static PrototypeAST *ParseExtern() {
1351 getNextToken(); // eat extern.
1352 return ParsePrototype();
1353}
1354
1355//===----------------------------------------------------------------------===//
1356// Code Generation
1357//===----------------------------------------------------------------------===//
1358
1359static Module *TheModule;
1360static LLVMFoldingBuilder Builder;
1361static std::map&lt;std::string, Value*&gt; NamedValues;
1362static FunctionPassManager *TheFPM;
1363
1364Value *ErrorV(const char *Str) { Error(Str); return 0; }
1365
1366Value *NumberExprAST::Codegen() {
1367 return ConstantFP::get(Type::DoubleTy, APFloat(Val));
1368}
1369
1370Value *VariableExprAST::Codegen() {
1371 // Look this variable up in the function.
1372 Value *V = NamedValues[Name];
1373 return V ? V : ErrorV("Unknown variable name");
1374}
1375
1376Value *UnaryExprAST::Codegen() {
1377 Value *OperandV = Operand-&gt;Codegen();
1378 if (OperandV == 0) return 0;
1379
1380 Function *F = TheModule-&gt;getFunction(std::string("unary")+Opcode);
1381 if (F == 0)
1382 return ErrorV("Unknown unary operator");
1383
1384 return Builder.CreateCall(F, OperandV, "unop");
1385}
1386
1387
1388Value *BinaryExprAST::Codegen() {
1389 Value *L = LHS-&gt;Codegen();
1390 Value *R = RHS-&gt;Codegen();
1391 if (L == 0 || R == 0) return 0;
1392
1393 switch (Op) {
1394 case '+': return Builder.CreateAdd(L, R, "addtmp");
1395 case '-': return Builder.CreateSub(L, R, "subtmp");
1396 case '*': return Builder.CreateMul(L, R, "multmp");
1397 case '&lt;':
Chris Lattner71155212007-11-06 01:39:12 +00001398 L = Builder.CreateFCmpULT(L, R, "cmptmp");
Chris Lattner58f2c872007-11-02 05:42:52 +00001399 // Convert bool 0/1 to double 0.0 or 1.0
1400 return Builder.CreateUIToFP(L, Type::DoubleTy, "booltmp");
1401 default: break;
1402 }
1403
1404 // If it wasn't a builtin binary operator, it must be a user defined one. Emit
1405 // a call to it.
1406 Function *F = TheModule-&gt;getFunction(std::string("binary")+Op);
1407 assert(F &amp;&amp; "binary operator not found!");
1408
1409 Value *Ops[] = { L, R };
1410 return Builder.CreateCall(F, Ops, Ops+2, "binop");
1411}
1412
1413Value *CallExprAST::Codegen() {
1414 // Look up the name in the global module table.
1415 Function *CalleeF = TheModule-&gt;getFunction(Callee);
1416 if (CalleeF == 0)
1417 return ErrorV("Unknown function referenced");
1418
1419 // If argument mismatch error.
1420 if (CalleeF-&gt;arg_size() != Args.size())
1421 return ErrorV("Incorrect # arguments passed");
1422
1423 std::vector&lt;Value*&gt; ArgsV;
1424 for (unsigned i = 0, e = Args.size(); i != e; ++i) {
1425 ArgsV.push_back(Args[i]-&gt;Codegen());
1426 if (ArgsV.back() == 0) return 0;
1427 }
1428
1429 return Builder.CreateCall(CalleeF, ArgsV.begin(), ArgsV.end(), "calltmp");
1430}
1431
1432Value *IfExprAST::Codegen() {
1433 Value *CondV = Cond-&gt;Codegen();
1434 if (CondV == 0) return 0;
1435
1436 // Convert condition to a bool by comparing equal to 0.0.
1437 CondV = Builder.CreateFCmpONE(CondV,
1438 ConstantFP::get(Type::DoubleTy, APFloat(0.0)),
1439 "ifcond");
1440
1441 Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();
1442
1443 // Create blocks for the then and else cases. Insert the 'then' block at the
1444 // end of the function.
1445 BasicBlock *ThenBB = new BasicBlock("then", TheFunction);
1446 BasicBlock *ElseBB = new BasicBlock("else");
1447 BasicBlock *MergeBB = new BasicBlock("ifcont");
1448
1449 Builder.CreateCondBr(CondV, ThenBB, ElseBB);
1450
1451 // Emit then value.
1452 Builder.SetInsertPoint(ThenBB);
1453
1454 Value *ThenV = Then-&gt;Codegen();
1455 if (ThenV == 0) return 0;
1456
1457 Builder.CreateBr(MergeBB);
1458 // Codegen of 'Then' can change the current block, update ThenBB for the PHI.
1459 ThenBB = Builder.GetInsertBlock();
1460
1461 // Emit else block.
1462 TheFunction-&gt;getBasicBlockList().push_back(ElseBB);
1463 Builder.SetInsertPoint(ElseBB);
1464
1465 Value *ElseV = Else-&gt;Codegen();
1466 if (ElseV == 0) return 0;
1467
1468 Builder.CreateBr(MergeBB);
1469 // Codegen of 'Else' can change the current block, update ElseBB for the PHI.
1470 ElseBB = Builder.GetInsertBlock();
1471
1472 // Emit merge block.
1473 TheFunction-&gt;getBasicBlockList().push_back(MergeBB);
1474 Builder.SetInsertPoint(MergeBB);
1475 PHINode *PN = Builder.CreatePHI(Type::DoubleTy, "iftmp");
1476
1477 PN-&gt;addIncoming(ThenV, ThenBB);
1478 PN-&gt;addIncoming(ElseV, ElseBB);
1479 return PN;
1480}
1481
1482Value *ForExprAST::Codegen() {
1483 // Output this as:
1484 // ...
1485 // start = startexpr
1486 // goto loop
1487 // loop:
1488 // variable = phi [start, loopheader], [nextvariable, loopend]
1489 // ...
1490 // bodyexpr
1491 // ...
1492 // loopend:
1493 // step = stepexpr
1494 // nextvariable = variable + step
1495 // endcond = endexpr
1496 // br endcond, loop, endloop
1497 // outloop:
1498
1499 // Emit the start code first, without 'variable' in scope.
1500 Value *StartVal = Start-&gt;Codegen();
1501 if (StartVal == 0) return 0;
1502
1503 // Make the new basic block for the loop header, inserting after current
1504 // block.
1505 Function *TheFunction = Builder.GetInsertBlock()-&gt;getParent();
1506 BasicBlock *PreheaderBB = Builder.GetInsertBlock();
1507 BasicBlock *LoopBB = new BasicBlock("loop", TheFunction);
1508
1509 // Insert an explicit fall through from the current block to the LoopBB.
1510 Builder.CreateBr(LoopBB);
1511
1512 // Start insertion in LoopBB.
1513 Builder.SetInsertPoint(LoopBB);
1514
1515 // Start the PHI node with an entry for Start.
1516 PHINode *Variable = Builder.CreatePHI(Type::DoubleTy, VarName.c_str());
1517 Variable-&gt;addIncoming(StartVal, PreheaderBB);
1518
1519 // Within the loop, the variable is defined equal to the PHI node. If it
1520 // shadows an existing variable, we have to restore it, so save it now.
1521 Value *OldVal = NamedValues[VarName];
1522 NamedValues[VarName] = Variable;
1523
1524 // Emit the body of the loop. This, like any other expr, can change the
1525 // current BB. Note that we ignore the value computed by the body, but don't
1526 // allow an error.
1527 if (Body-&gt;Codegen() == 0)
1528 return 0;
1529
1530 // Emit the step value.
1531 Value *StepVal;
1532 if (Step) {
1533 StepVal = Step-&gt;Codegen();
1534 if (StepVal == 0) return 0;
1535 } else {
1536 // If not specified, use 1.0.
1537 StepVal = ConstantFP::get(Type::DoubleTy, APFloat(1.0));
1538 }
1539
1540 Value *NextVar = Builder.CreateAdd(Variable, StepVal, "nextvar");
1541
1542 // Compute the end condition.
1543 Value *EndCond = End-&gt;Codegen();
1544 if (EndCond == 0) return EndCond;
1545
1546 // Convert condition to a bool by comparing equal to 0.0.
1547 EndCond = Builder.CreateFCmpONE(EndCond,
1548 ConstantFP::get(Type::DoubleTy, APFloat(0.0)),
1549 "loopcond");
1550
1551 // Create the "after loop" block and insert it.
1552 BasicBlock *LoopEndBB = Builder.GetInsertBlock();
1553 BasicBlock *AfterBB = new BasicBlock("afterloop", TheFunction);
1554
1555 // Insert the conditional branch into the end of LoopEndBB.
1556 Builder.CreateCondBr(EndCond, LoopBB, AfterBB);
1557
1558 // Any new code will be inserted in AfterBB.
1559 Builder.SetInsertPoint(AfterBB);
1560
1561 // Add a new entry to the PHI node for the backedge.
1562 Variable-&gt;addIncoming(NextVar, LoopEndBB);
1563
1564 // Restore the unshadowed variable.
1565 if (OldVal)
1566 NamedValues[VarName] = OldVal;
1567 else
1568 NamedValues.erase(VarName);
1569
1570
1571 // for expr always returns 0.0.
1572 return Constant::getNullValue(Type::DoubleTy);
1573}
1574
1575Function *PrototypeAST::Codegen() {
1576 // Make the function type: double(double,double) etc.
1577 std::vector&lt;const Type*&gt; Doubles(Args.size(), Type::DoubleTy);
1578 FunctionType *FT = FunctionType::get(Type::DoubleTy, Doubles, false);
1579
1580 Function *F = new Function(FT, Function::ExternalLinkage, Name, TheModule);
1581
1582 // If F conflicted, there was already something named 'Name'. If it has a
1583 // body, don't allow redefinition or reextern.
1584 if (F-&gt;getName() != Name) {
1585 // Delete the one we just made and get the existing one.
1586 F-&gt;eraseFromParent();
1587 F = TheModule-&gt;getFunction(Name);
1588
1589 // If F already has a body, reject this.
1590 if (!F-&gt;empty()) {
1591 ErrorF("redefinition of function");
1592 return 0;
1593 }
1594
1595 // If F took a different number of args, reject.
1596 if (F-&gt;arg_size() != Args.size()) {
1597 ErrorF("redefinition of function with different # args");
1598 return 0;
1599 }
1600 }
1601
1602 // Set names for all arguments.
1603 unsigned Idx = 0;
1604 for (Function::arg_iterator AI = F-&gt;arg_begin(); Idx != Args.size();
1605 ++AI, ++Idx) {
1606 AI-&gt;setName(Args[Idx]);
1607
1608 // Add arguments to variable symbol table.
1609 NamedValues[Args[Idx]] = AI;
1610 }
1611
1612 return F;
1613}
1614
1615Function *FunctionAST::Codegen() {
1616 NamedValues.clear();
1617
1618 Function *TheFunction = Proto-&gt;Codegen();
1619 if (TheFunction == 0)
1620 return 0;
1621
1622 // If this is an operator, install it.
1623 if (Proto-&gt;isBinaryOp())
1624 BinopPrecedence[Proto-&gt;getOperatorName()] = Proto-&gt;getBinaryPrecedence();
1625
1626 // Create a new basic block to start insertion into.
1627 BasicBlock *BB = new BasicBlock("entry", TheFunction);
1628 Builder.SetInsertPoint(BB);
1629
1630 if (Value *RetVal = Body-&gt;Codegen()) {
1631 // Finish off the function.
1632 Builder.CreateRet(RetVal);
1633
1634 // Validate the generated code, checking for consistency.
1635 verifyFunction(*TheFunction);
1636
1637 // Optimize the function.
1638 TheFPM-&gt;run(*TheFunction);
1639
1640 return TheFunction;
1641 }
1642
1643 // Error reading body, remove function.
1644 TheFunction-&gt;eraseFromParent();
1645
1646 if (Proto-&gt;isBinaryOp())
1647 BinopPrecedence.erase(Proto-&gt;getOperatorName());
1648 return 0;
1649}
1650
1651//===----------------------------------------------------------------------===//
1652// Top-Level parsing and JIT Driver
1653//===----------------------------------------------------------------------===//
1654
1655static ExecutionEngine *TheExecutionEngine;
1656
1657static void HandleDefinition() {
1658 if (FunctionAST *F = ParseDefinition()) {
1659 if (Function *LF = F-&gt;Codegen()) {
1660 fprintf(stderr, "Read function definition:");
1661 LF-&gt;dump();
1662 }
1663 } else {
1664 // Skip token for error recovery.
1665 getNextToken();
1666 }
1667}
1668
1669static void HandleExtern() {
1670 if (PrototypeAST *P = ParseExtern()) {
1671 if (Function *F = P-&gt;Codegen()) {
1672 fprintf(stderr, "Read extern: ");
1673 F-&gt;dump();
1674 }
1675 } else {
1676 // Skip token for error recovery.
1677 getNextToken();
1678 }
1679}
1680
1681static void HandleTopLevelExpression() {
1682 // Evaluate a top level expression into an anonymous function.
1683 if (FunctionAST *F = ParseTopLevelExpr()) {
1684 if (Function *LF = F-&gt;Codegen()) {
1685 // JIT the function, returning a function pointer.
1686 void *FPtr = TheExecutionEngine-&gt;getPointerToFunction(LF);
1687
1688 // Cast it to the right type (takes no arguments, returns a double) so we
1689 // can call it as a native function.
1690 double (*FP)() = (double (*)())FPtr;
1691 fprintf(stderr, "Evaluated to %f\n", FP());
1692 }
1693 } else {
1694 // Skip token for error recovery.
1695 getNextToken();
1696 }
1697}
1698
1699/// top ::= definition | external | expression | ';'
1700static void MainLoop() {
1701 while (1) {
1702 fprintf(stderr, "ready&gt; ");
1703 switch (CurTok) {
1704 case tok_eof: return;
1705 case ';': getNextToken(); break; // ignore top level semicolons.
1706 case tok_def: HandleDefinition(); break;
1707 case tok_extern: HandleExtern(); break;
1708 default: HandleTopLevelExpression(); break;
1709 }
1710 }
1711}
1712
1713
1714
1715//===----------------------------------------------------------------------===//
1716// "Library" functions that can be "extern'd" from user code.
1717//===----------------------------------------------------------------------===//
1718
1719/// putchard - putchar that takes a double and returns 0.
1720extern "C"
1721double putchard(double X) {
1722 putchar((char)X);
1723 return 0;
1724}
1725
1726/// printd - printf that takes a double prints it as "%f\n", returning 0.
1727extern "C"
1728double printd(double X) {
1729 printf("%f\n", X);
1730 return 0;
1731}
1732
1733//===----------------------------------------------------------------------===//
1734// Main driver code.
1735//===----------------------------------------------------------------------===//
1736
1737int main() {
1738 // Install standard binary operators.
1739 // 1 is lowest precedence.
1740 BinopPrecedence['&lt;'] = 10;
1741 BinopPrecedence['+'] = 20;
1742 BinopPrecedence['-'] = 20;
1743 BinopPrecedence['*'] = 40; // highest.
1744
1745 // Prime the first token.
1746 fprintf(stderr, "ready&gt; ");
1747 getNextToken();
1748
1749 // Make the module, which holds all the code.
1750 TheModule = new Module("my cool jit");
1751
1752 // Create the JIT.
1753 TheExecutionEngine = ExecutionEngine::create(TheModule);
1754
1755 {
1756 ExistingModuleProvider OurModuleProvider(TheModule);
1757 FunctionPassManager OurFPM(&amp;OurModuleProvider);
1758
1759 // Set up the optimizer pipeline. Start with registering info about how the
1760 // target lays out data structures.
1761 OurFPM.add(new TargetData(*TheExecutionEngine-&gt;getTargetData()));
1762 // Do simple "peephole" optimizations and bit-twiddling optzns.
1763 OurFPM.add(createInstructionCombiningPass());
1764 // Reassociate expressions.
1765 OurFPM.add(createReassociatePass());
1766 // Eliminate Common SubExpressions.
1767 OurFPM.add(createGVNPass());
1768 // Simplify the control flow graph (deleting unreachable blocks, etc).
1769 OurFPM.add(createCFGSimplificationPass());
1770 // Set the global so the code gen can use this.
1771 TheFPM = &amp;OurFPM;
1772
1773 // Run the main "interpreter loop" now.
1774 MainLoop();
1775
1776 TheFPM = 0;
1777 } // Free module provider and pass manager.
1778
1779
1780 // Print out all of the generated code.
1781 TheModule->dump();
1782 return 0;
1783}
Chris Lattnerc9d5d2c2007-11-01 06:49:54 +00001784</pre>
1785</div>
1786
1787</div>
1788
1789<!-- *********************************************************************** -->
1790<hr>
1791<address>
1792 <a href="http://jigsaw.w3.org/css-validator/check/referer"><img
1793 src="http://jigsaw.w3.org/css-validator/images/vcss" alt="Valid CSS!"></a>
1794 <a href="http://validator.w3.org/check/referer"><img
1795 src="http://www.w3.org/Icons/valid-html401" alt="Valid HTML 4.01!"></a>
1796
1797 <a href="mailto:sabre@nondot.org">Chris Lattner</a><br>
1798 <a href="http://llvm.org">The LLVM Compiler Infrastructure</a><br>
1799 Last modified: $Date: 2007-10-17 11:05:13 -0700 (Wed, 17 Oct 2007) $
1800</address>
1801</body>
1802</html>