blob: 7f4b76a554ee1ecab87e4905840da84a20c9662a [file] [log] [blame]
Chris Lattner2e902042007-10-22 07:01:42 +00001<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
2 "http://www.w3.org/TR/html4/strict.dtd">
3
4<html>
5<head>
6 <title>Kaleidoscope: Implementing code generation to LLVM IR</title>
7 <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
8 <meta name="author" content="Chris Lattner">
9 <link rel="stylesheet" href="../llvm.css" type="text/css">
10</head>
11
12<body>
13
14<div class="doc_title">Kaleidoscope: Code generation to LLVM IR</div>
15
Chris Lattner128eb862007-11-05 19:06:59 +000016<ul>
Chris Lattner0e555b12007-11-05 20:04:56 +000017<li><a href="index.html">Up to Tutorial Index</a></li>
Chris Lattner128eb862007-11-05 19:06:59 +000018<li>Chapter 3
19 <ol>
20 <li><a href="#intro">Chapter 3 Introduction</a></li>
Chris Lattner7badb2d2007-11-06 07:26:32 +000021 <li><a href="#basics">Code Generation Setup</a></li>
Chris Lattner128eb862007-11-05 19:06:59 +000022 <li><a href="#exprs">Expression Code Generation</a></li>
23 <li><a href="#funcs">Function Code Generation</a></li>
24 <li><a href="#driver">Driver Changes and Closing Thoughts</a></li>
25 <li><a href="#code">Full Code Listing</a></li>
26 </ol>
27</li>
Chris Lattner0e555b12007-11-05 20:04:56 +000028<li><a href="LangImpl4.html">Chapter 4</a>: Adding JIT and Optimizer
29Support</li>
Chris Lattner128eb862007-11-05 19:06:59 +000030</ul>
31
Chris Lattner2e902042007-10-22 07:01:42 +000032<div class="doc_author">
33 <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a></p>
34</div>
35
36<!-- *********************************************************************** -->
Chris Lattner128eb862007-11-05 19:06:59 +000037<div class="doc_section"><a name="intro">Chapter 3 Introduction</a></div>
Chris Lattner2e902042007-10-22 07:01:42 +000038<!-- *********************************************************************** -->
39
40<div class="doc_text">
41
Chris Lattner128eb862007-11-05 19:06:59 +000042<p>Welcome to Chapter 3 of the "<a href="index.html">Implementing a language
43with LLVM</a>" tutorial. This chapter shows you how to transform the <a
Chris Lattner41fcea32007-11-13 07:06:30 +000044href="LangImpl2.html">Abstract Syntax Tree</a>, built in Chapter 2, into LLVM IR.
Chris Lattner2e902042007-10-22 07:01:42 +000045This will teach you a little bit about how LLVM does things, as well as
46demonstrate how easy it is to use. It's much more work to build a lexer and
Chris Lattner7badb2d2007-11-06 07:26:32 +000047parser than it is to generate LLVM IR code. :)
Chris Lattner2e902042007-10-22 07:01:42 +000048</p>
49
Chris Lattner51c97172007-11-28 19:26:42 +000050<p><b>Please note</b>: the code in this chapter and later require LLVM 2.2 or
51LLVM SVN to work. LLVM 2.1 and before will not work with it.</p>
52
Chris Lattner2e902042007-10-22 07:01:42 +000053</div>
54
55<!-- *********************************************************************** -->
Chris Lattner7badb2d2007-11-06 07:26:32 +000056<div class="doc_section"><a name="basics">Code Generation Setup</a></div>
Chris Lattner2e902042007-10-22 07:01:42 +000057<!-- *********************************************************************** -->
58
59<div class="doc_text">
60
61<p>
Chris Lattner729eb142008-02-10 19:11:04 +000062In order to generate LLVM IR, we want some simple setup to get started. First
63we define virtual code generation (codegen) methods in each AST class:</p>
Chris Lattner2e902042007-10-22 07:01:42 +000064
65<div class="doc_code">
66<pre>
67/// ExprAST - Base class for all expression nodes.
68class ExprAST {
69public:
70 virtual ~ExprAST() {}
Chris Lattnercac21352007-11-05 19:25:14 +000071 <b>virtual Value *Codegen() = 0;</b>
Chris Lattner2e902042007-10-22 07:01:42 +000072};
73
74/// NumberExprAST - Expression class for numeric literals like "1.0".
75class NumberExprAST : public ExprAST {
76 double Val;
77public:
Chris Lattner28571ed2007-10-23 04:27:44 +000078 explicit NumberExprAST(double val) : Val(val) {}
Chris Lattnercac21352007-11-05 19:25:14 +000079 <b>virtual Value *Codegen();</b>
Chris Lattner2e902042007-10-22 07:01:42 +000080};
81...
82</pre>
83</div>
84
Chris Lattner41fcea32007-11-13 07:06:30 +000085<p>The Codegen() method says to emit IR for that AST node along with all the things it
Chris Lattner28571ed2007-10-23 04:27:44 +000086depends on, and they all return an LLVM Value object.
87"Value" is the class used to represent a "<a
88href="http://en.wikipedia.org/wiki/Static_single_assignment_form">Static Single
89Assignment (SSA)</a> register" or "SSA value" in LLVM. The most distinct aspect
90of SSA values is that their value is computed as the related instruction
91executes, and it does not get a new value until (and if) the instruction
Chris Lattner41fcea32007-11-13 07:06:30 +000092re-executes. In other words, there is no way to "change" an SSA value. For
Chris Lattner28571ed2007-10-23 04:27:44 +000093more information, please read up on <a
94href="http://en.wikipedia.org/wiki/Static_single_assignment_form">Static Single
95Assignment</a> - the concepts are really quite natural once you grok them.</p>
96
Chris Lattnercac21352007-11-05 19:25:14 +000097<p>Note that instead of adding virtual methods to the ExprAST class hierarchy,
Chris Lattner729eb142008-02-10 19:11:04 +000098it could also make sense to use a <a
99href="http://en.wikipedia.org/wiki/Visitor_pattern">visitor pattern</a> or some
100other way to model this. Again, this tutorial won't dwell on good software
101engineering practices: for our purposes, adding a virtual method is
102simplest.</p>
Chris Lattnercac21352007-11-05 19:25:14 +0000103
Chris Lattner28571ed2007-10-23 04:27:44 +0000104<p>The
Chris Lattner41fcea32007-11-13 07:06:30 +0000105second thing we want is an "Error" method like we used for the parser, which will
Chris Lattner2e902042007-10-22 07:01:42 +0000106be used to report errors found during code generation (for example, use of an
107undeclared parameter):</p>
108
109<div class="doc_code">
110<pre>
111Value *ErrorV(const char *Str) { Error(Str); return 0; }
112
113static Module *TheModule;
Duncan Sands89f6d882008-04-13 06:22:09 +0000114static IRBuilder Builder;
Chris Lattner2e902042007-10-22 07:01:42 +0000115static std::map&lt;std::string, Value*&gt; NamedValues;
116</pre>
117</div>
118
119<p>The static variables will be used during code generation. <tt>TheModule</tt>
120is the LLVM construct that contains all of the functions and global variables in
121a chunk of code. In many ways, it is the top-level structure that the LLVM IR
122uses to contain code.</p>
123
124<p>The <tt>Builder</tt> object is a helper object that makes it easy to generate
Chris Lattnerf6e53df2007-11-05 18:02:15 +0000125LLVM instructions. Instances of the <a
Duncan Sands89f6d882008-04-13 06:22:09 +0000126href="http://llvm.org/doxygen/IRBuilder_8h-source.html"><tt>IRBuilder</tt></a>
Chris Lattner729eb142008-02-10 19:11:04 +0000127class keep track of the current place to insert instructions and has methods to
128create new instructions.</p>
Chris Lattner2e902042007-10-22 07:01:42 +0000129
130<p>The <tt>NamedValues</tt> map keeps track of which values are defined in the
Chris Lattner729eb142008-02-10 19:11:04 +0000131current scope and what their LLVM representation is. (In other words, it is a
132symbol table for the code). In this form of Kaleidoscope, the only things that
133can be referenced are function parameters. As such, function parameters will
134be in this map when generating code for their function body.</p>
Chris Lattner2e902042007-10-22 07:01:42 +0000135
136<p>
137With these basics in place, we can start talking about how to generate code for
138each expression. Note that this assumes that the <tt>Builder</tt> has been set
139up to generate code <em>into</em> something. For now, we'll assume that this
140has already been done, and we'll just use it to emit code.
141</p>
142
143</div>
144
145<!-- *********************************************************************** -->
146<div class="doc_section"><a name="exprs">Expression Code Generation</a></div>
147<!-- *********************************************************************** -->
148
149<div class="doc_text">
150
Chris Lattner41fcea32007-11-13 07:06:30 +0000151<p>Generating LLVM code for expression nodes is very straightforward: less
Chris Lattner729eb142008-02-10 19:11:04 +0000152than 45 lines of commented code for all four of our expression nodes. First
Chris Lattner2e902042007-10-22 07:01:42 +0000153we'll do numeric literals:</p>
154
155<div class="doc_code">
156<pre>
157Value *NumberExprAST::Codegen() {
158 return ConstantFP::get(Type::DoubleTy, APFloat(Val));
159}
160</pre>
161</div>
162
Chris Lattnerd3f0cdd2007-10-23 04:51:30 +0000163<p>In the LLVM IR, numeric constants are represented with the
164<tt>ConstantFP</tt> class, which holds the numeric value in an <tt>APFloat</tt>
165internally (<tt>APFloat</tt> has the capability of holding floating point
166constants of <em>A</em>rbitrary <em>P</em>recision). This code basically just
167creates and returns a <tt>ConstantFP</tt>. Note that in the LLVM IR
Chris Lattner2e902042007-10-22 07:01:42 +0000168that constants are all uniqued together and shared. For this reason, the API
Chris Lattner7badb2d2007-11-06 07:26:32 +0000169uses "the foo::get(..)" idiom instead of "new foo(..)" or "foo::create(..)".</p>
Chris Lattner2e902042007-10-22 07:01:42 +0000170
171<div class="doc_code">
172<pre>
173Value *VariableExprAST::Codegen() {
174 // Look this variable up in the function.
175 Value *V = NamedValues[Name];
176 return V ? V : ErrorV("Unknown variable name");
177}
178</pre>
179</div>
180
Chris Lattner41fcea32007-11-13 07:06:30 +0000181<p>References to variables are also quite simple using LLVM. In the simple version
Chris Lattnerd3f0cdd2007-10-23 04:51:30 +0000182of Kaleidoscope, we assume that the variable has already been emited somewhere
183and its value is available. In practice, the only values that can be in the
184<tt>NamedValues</tt> map are function arguments. This
Chris Lattner2e902042007-10-22 07:01:42 +0000185code simply checks to see that the specified name is in the map (if not, an
Chris Lattner7badb2d2007-11-06 07:26:32 +0000186unknown variable is being referenced) and returns the value for it. In future
187chapters, we'll add support for <a href="LangImpl5.html#for">loop induction
188variables</a> in the symbol table, and for <a
189href="LangImpl7.html#localvars">local variables</a>.</p>
Chris Lattner2e902042007-10-22 07:01:42 +0000190
191<div class="doc_code">
192<pre>
193Value *BinaryExprAST::Codegen() {
194 Value *L = LHS-&gt;Codegen();
195 Value *R = RHS-&gt;Codegen();
196 if (L == 0 || R == 0) return 0;
197
198 switch (Op) {
199 case '+': return Builder.CreateAdd(L, R, "addtmp");
200 case '-': return Builder.CreateSub(L, R, "subtmp");
201 case '*': return Builder.CreateMul(L, R, "multmp");
202 case '&lt;':
Chris Lattner71155212007-11-06 01:39:12 +0000203 L = Builder.CreateFCmpULT(L, R, "cmptmp");
Chris Lattner2e902042007-10-22 07:01:42 +0000204 // Convert bool 0/1 to double 0.0 or 1.0
205 return Builder.CreateUIToFP(L, Type::DoubleTy, "booltmp");
206 default: return ErrorV("invalid binary operator");
207 }
208}
209</pre>
210</div>
211
Chris Lattnerd3f0cdd2007-10-23 04:51:30 +0000212<p>Binary operators start to get more interesting. The basic idea here is that
213we recursively emit code for the left-hand side of the expression, then the
214right-hand side, then we compute the result of the binary expression. In this
215code, we do a simple switch on the opcode to create the right LLVM instruction.
216</p>
Chris Lattner2e902042007-10-22 07:01:42 +0000217
Chris Lattner41fcea32007-11-13 07:06:30 +0000218<p>In the example above, the LLVM builder class is starting to show its value.
Duncan Sands89f6d882008-04-13 06:22:09 +0000219IRBuilder knows where to insert the newly created instruction, all you have to
Chris Lattner41fcea32007-11-13 07:06:30 +0000220do is specify what instruction to create (e.g. with <tt>CreateAdd</tt>), which
Chris Lattnerd3f0cdd2007-10-23 04:51:30 +0000221operands to use (<tt>L</tt> and <tt>R</tt> here) and optionally provide a name
Chris Lattner729eb142008-02-10 19:11:04 +0000222for the generated instruction.</p>
223
224<p>One nice thing about LLVM is that the name is just a hint. For instance, if
225the code above emits multiple "addtmp" variables, LLVM will automatically
226provide each one with an increasing, unique numeric suffix. Local value names
227for instructions are purely optional, but it makes it much easier to read the
228IR dumps.</p>
Chris Lattnerd3f0cdd2007-10-23 04:51:30 +0000229
Chris Lattner41fcea32007-11-13 07:06:30 +0000230<p><a href="../LangRef.html#instref">LLVM instructions</a> are constrained by
Chris Lattner7badb2d2007-11-06 07:26:32 +0000231strict rules: for example, the Left and Right operators of
Chris Lattner41fcea32007-11-13 07:06:30 +0000232an <a href="../LangRef.html#i_add">add instruction</a> must have the same
233type, and the result type of the add must match the operand types. Because
Chris Lattner7badb2d2007-11-06 07:26:32 +0000234all values in Kaleidoscope are doubles, this makes for very simple code for add,
235sub and mul.</p>
Chris Lattnerd3f0cdd2007-10-23 04:51:30 +0000236
237<p>On the other hand, LLVM specifies that the <a
238href="../LangRef.html#i_fcmp">fcmp instruction</a> always returns an 'i1' value
Chris Lattner41fcea32007-11-13 07:06:30 +0000239(a one bit integer). The problem with this is that Kaleidoscope wants the value to be a 0.0 or 1.0 value. In order to get these semantics, we combine the fcmp instruction with
Chris Lattnerd3f0cdd2007-10-23 04:51:30 +0000240a <a href="../LangRef.html#i_uitofp">uitofp instruction</a>. This instruction
241converts its input integer into a floating point value by treating the input
242as an unsigned value. In contrast, if we used the <a
Chris Lattnerd96b1592007-11-07 05:07:10 +0000243href="../LangRef.html#i_sitofp">sitofp instruction</a>, the Kaleidoscope '&lt;'
Chris Lattnerd3f0cdd2007-10-23 04:51:30 +0000244operator would return 0.0 and -1.0, depending on the input value.</p>
Chris Lattner2e902042007-10-22 07:01:42 +0000245
246<div class="doc_code">
247<pre>
248Value *CallExprAST::Codegen() {
249 // Look up the name in the global module table.
250 Function *CalleeF = TheModule-&gt;getFunction(Callee);
251 if (CalleeF == 0)
252 return ErrorV("Unknown function referenced");
253
254 // If argument mismatch error.
255 if (CalleeF-&gt;arg_size() != Args.size())
256 return ErrorV("Incorrect # arguments passed");
257
258 std::vector&lt;Value*&gt; ArgsV;
259 for (unsigned i = 0, e = Args.size(); i != e; ++i) {
260 ArgsV.push_back(Args[i]-&gt;Codegen());
261 if (ArgsV.back() == 0) return 0;
262 }
263
264 return Builder.CreateCall(CalleeF, ArgsV.begin(), ArgsV.end(), "calltmp");
265}
266</pre>
267</div>
268
Chris Lattner41fcea32007-11-13 07:06:30 +0000269<p>Code generation for function calls is quite straightforward with LLVM. The
270code above initially does a function name lookup in the LLVM Module's symbol
Chris Lattnerd3f0cdd2007-10-23 04:51:30 +0000271table. Recall that the LLVM Module is the container that holds all of the
272functions we are JIT'ing. By giving each function the same name as what the
273user specifies, we can use the LLVM symbol table to resolve function names for
274us.</p>
275
276<p>Once we have the function to call, we recursively codegen each argument that
277is to be passed in, and create an LLVM <a href="../LangRef.html#i_call">call
278instruction</a>. Note that LLVM uses the native C calling conventions by
Chris Lattner41fcea32007-11-13 07:06:30 +0000279default, allowing these calls to also call into standard library functions like
280"sin" and "cos", with no additional effort.</p>
Chris Lattnerd3f0cdd2007-10-23 04:51:30 +0000281
282<p>This wraps up our handling of the four basic expressions that we have so far
283in Kaleidoscope. Feel free to go in and add some more. For example, by
284browsing the <a href="../LangRef.html">LLVM language reference</a> you'll find
285several other interesting instructions that are really easy to plug into our
286basic framework.</p>
Chris Lattner2e902042007-10-22 07:01:42 +0000287
288</div>
289
290<!-- *********************************************************************** -->
Chris Lattner35abbf52007-10-23 06:23:57 +0000291<div class="doc_section"><a name="funcs">Function Code Generation</a></div>
Chris Lattner2e902042007-10-22 07:01:42 +0000292<!-- *********************************************************************** -->
293
294<div class="doc_text">
295
Chris Lattnerd96b1592007-11-07 05:07:10 +0000296<p>Code generation for prototypes and functions must handle a number of
297details, which make their code less beautiful than expression code
298generation, but allows us to illustrate some important points. First, lets
299talk about code generation for prototypes: they are used both for function
300bodies and external function declarations. The code starts with:</p>
Chris Lattner35abbf52007-10-23 06:23:57 +0000301
302<div class="doc_code">
303<pre>
304Function *PrototypeAST::Codegen() {
305 // Make the function type: double(double,double) etc.
306 std::vector&lt;const Type*&gt; Doubles(Args.size(), Type::DoubleTy);
307 FunctionType *FT = FunctionType::get(Type::DoubleTy, Doubles, false);
308
309 Function *F = new Function(FT, Function::ExternalLinkage, Name, TheModule);
310</pre>
311</div>
312
Chris Lattnercf9893d2007-11-05 19:22:50 +0000313<p>This code packs a lot of power into a few lines. Note first that this
Chris Lattnerd96b1592007-11-07 05:07:10 +0000314function returns a "Function*" instead of a "Value*". Because a "prototype"
315really talks about the external interface for a function (not the value computed
316by an expression), it makes sense for it to return the LLVM Function it
317corresponds to when codegen'd.</p>
Chris Lattnercf9893d2007-11-05 19:22:50 +0000318
Chris Lattnerd96b1592007-11-07 05:07:10 +0000319<p>The call to <tt>FunctionType::get</tt> creates
Chris Lattner35abbf52007-10-23 06:23:57 +0000320the <tt>FunctionType</tt> that should be used for a given Prototype. Since all
321function arguments in Kaleidoscope are of type double, the first line creates
Chris Lattnerd96b1592007-11-07 05:07:10 +0000322a vector of "N" LLVM double types. It then uses the <tt>FunctionType::get</tt>
Chris Lattner35abbf52007-10-23 06:23:57 +0000323method to create a function type that takes "N" doubles as arguments, returns
324one double as a result, and that is not vararg (the false parameter indicates
325this). Note that Types in LLVM are uniqued just like Constants are, so you
326don't "new" a type, you "get" it.</p>
327
328<p>The final line above actually creates the function that the prototype will
Chris Lattner41fcea32007-11-13 07:06:30 +0000329correspond to. This indicates the type, linkage and name to use, as well as which
Chris Lattner1ace67c2008-04-15 16:59:22 +0000330module to insert into. "<a href="../LangRef.html#linkage">external linkage</a>"
Chris Lattner35abbf52007-10-23 06:23:57 +0000331means that the function may be defined outside the current module and/or that it
332is callable by functions outside the module. The Name passed in is the name the
333user specified: since "<tt>TheModule</tt>" is specified, this name is registered
334in "<tt>TheModule</tt>"s symbol table, which is used by the function call code
335above.</p>
336
337<div class="doc_code">
338<pre>
339 // If F conflicted, there was already something named 'Name'. If it has a
340 // body, don't allow redefinition or reextern.
341 if (F-&gt;getName() != Name) {
342 // Delete the one we just made and get the existing one.
343 F-&gt;eraseFromParent();
344 F = TheModule-&gt;getFunction(Name);
345</pre>
346</div>
347
348<p>The Module symbol table works just like the Function symbol table when it
349comes to name conflicts: if a new function is created with a name was previously
350added to the symbol table, it will get implicitly renamed when added to the
Chris Lattner41fcea32007-11-13 07:06:30 +0000351Module. The code above exploits this fact to determine if there was a previous
Chris Lattner35abbf52007-10-23 06:23:57 +0000352definition of this function.</p>
353
354<p>In Kaleidoscope, I choose to allow redefinitions of functions in two cases:
Chris Lattnerd96b1592007-11-07 05:07:10 +0000355first, we want to allow 'extern'ing a function more than once, as long as the
Chris Lattner35abbf52007-10-23 06:23:57 +0000356prototypes for the externs match (since all arguments have the same type, we
357just have to check that the number of arguments match). Second, we want to
358allow 'extern'ing a function and then definining a body for it. This is useful
359when defining mutually recursive functions.</p>
360
361<p>In order to implement this, the code above first checks to see if there is
362a collision on the name of the function. If so, it deletes the function we just
363created (by calling <tt>eraseFromParent</tt>) and then calling
364<tt>getFunction</tt> to get the existing function with the specified name. Note
365that many APIs in LLVM have "erase" forms and "remove" forms. The "remove" form
366unlinks the object from its parent (e.g. a Function from a Module) and returns
367it. The "erase" form unlinks the object and then deletes it.</p>
368
369<div class="doc_code">
370<pre>
371 // If F already has a body, reject this.
372 if (!F-&gt;empty()) {
373 ErrorF("redefinition of function");
374 return 0;
375 }
376
377 // If F took a different number of args, reject.
378 if (F-&gt;arg_size() != Args.size()) {
379 ErrorF("redefinition of function with different # args");
380 return 0;
381 }
382 }
383</pre>
384</div>
385
Chris Lattnerd96b1592007-11-07 05:07:10 +0000386<p>In order to verify the logic above, we first check to see if the pre-existing
Chris Lattner35abbf52007-10-23 06:23:57 +0000387function is "empty". In this case, empty means that it has no basic blocks in
Chris Lattnerd96b1592007-11-07 05:07:10 +0000388it, which means it has no body. If it has no body, it is a forward
Chris Lattner35abbf52007-10-23 06:23:57 +0000389declaration. Since we don't allow anything after a full definition of the
390function, the code rejects this case. If the previous reference to a function
391was an 'extern', we simply verify that the number of arguments for that
392definition and this one match up. If not, we emit an error.</p>
393
394<div class="doc_code">
395<pre>
396 // Set names for all arguments.
397 unsigned Idx = 0;
398 for (Function::arg_iterator AI = F-&gt;arg_begin(); Idx != Args.size();
399 ++AI, ++Idx) {
400 AI-&gt;setName(Args[Idx]);
401
402 // Add arguments to variable symbol table.
403 NamedValues[Args[Idx]] = AI;
404 }
405 return F;
406}
407</pre>
408</div>
409
410<p>The last bit of code for prototypes loops over all of the arguments in the
Chris Lattner41fcea32007-11-13 07:06:30 +0000411function, setting the name of the LLVM Argument objects to match, and registering
Chris Lattner35abbf52007-10-23 06:23:57 +0000412the arguments in the <tt>NamedValues</tt> map for future use by the
413<tt>VariableExprAST</tt> AST node. Once this is set up, it returns the Function
414object to the caller. Note that we don't check for conflicting
415argument names here (e.g. "extern foo(a b a)"). Doing so would be very
Chris Lattnerd96b1592007-11-07 05:07:10 +0000416straight-forward with the mechanics we have already used above.</p>
Chris Lattner35abbf52007-10-23 06:23:57 +0000417
418<div class="doc_code">
419<pre>
420Function *FunctionAST::Codegen() {
421 NamedValues.clear();
422
423 Function *TheFunction = Proto-&gt;Codegen();
424 if (TheFunction == 0)
425 return 0;
426</pre>
427</div>
428
Chris Lattner41fcea32007-11-13 07:06:30 +0000429<p>Code generation for function definitions starts out simply enough: we just
430codegen the prototype (Proto) and verify that it is ok. We then clear out the
Chris Lattner35abbf52007-10-23 06:23:57 +0000431<tt>NamedValues</tt> map to make sure that there isn't anything in it from the
Chris Lattnera1cd2242007-11-06 05:07:30 +0000432last function we compiled. Code generation of the prototype ensures that there
433is an LLVM Function object that is ready to go for us.</p>
Chris Lattner35abbf52007-10-23 06:23:57 +0000434
435<div class="doc_code">
436<pre>
437 // Create a new basic block to start insertion into.
438 BasicBlock *BB = new BasicBlock("entry", TheFunction);
439 Builder.SetInsertPoint(BB);
440
441 if (Value *RetVal = Body-&gt;Codegen()) {
Chris Lattner35abbf52007-10-23 06:23:57 +0000442</pre>
443</div>
444
445<p>Now we get to the point where the <tt>Builder</tt> is set up. The first
446line creates a new <a href="http://en.wikipedia.org/wiki/Basic_block">basic
447block</a> (named "entry"), which is inserted into <tt>TheFunction</tt>. The
448second line then tells the builder that new instructions should be inserted into
449the end of the new basic block. Basic blocks in LLVM are an important part
450of functions that define the <a
451href="http://en.wikipedia.org/wiki/Control_flow_graph">Control Flow Graph</a>.
452Since we don't have any control flow, our functions will only contain one
Chris Lattner41fcea32007-11-13 07:06:30 +0000453block at this point. We'll fix this in <a href="LangImpl5.html">Chapter 5</a> :).</p>
Chris Lattner35abbf52007-10-23 06:23:57 +0000454
Chris Lattnerd9b86162007-10-25 04:30:35 +0000455<div class="doc_code">
456<pre>
457 if (Value *RetVal = Body-&gt;Codegen()) {
458 // Finish off the function.
459 Builder.CreateRet(RetVal);
460
461 // Validate the generated code, checking for consistency.
462 verifyFunction(*TheFunction);
463 return TheFunction;
464 }
465</pre>
466</div>
467
Chris Lattner35abbf52007-10-23 06:23:57 +0000468<p>Once the insertion point is set up, we call the <tt>CodeGen()</tt> method for
469the root expression of the function. If no error happens, this emits code to
470compute the expression into the entry block and returns the value that was
471computed. Assuming no error, we then create an LLVM <a
Chris Lattnerd9b86162007-10-25 04:30:35 +0000472href="../LangRef.html#i_ret">ret instruction</a>, which completes the function.
Chris Lattner41fcea32007-11-13 07:06:30 +0000473Once the function is built, we call <tt>verifyFunction</tt>, which
Chris Lattnerd9b86162007-10-25 04:30:35 +0000474is provided by LLVM. This function does a variety of consistency checks on the
475generated code, to determine if our compiler is doing everything right. Using
476this is important: it can catch a lot of bugs. Once the function is finished
477and validated, we return it.</p>
Chris Lattner35abbf52007-10-23 06:23:57 +0000478
479<div class="doc_code">
480<pre>
481 // Error reading body, remove function.
482 TheFunction-&gt;eraseFromParent();
483 return 0;
484}
485</pre>
486</div>
487
488<p>The only piece left here is handling of the error case. For simplicity, we
Chris Lattner41fcea32007-11-13 07:06:30 +0000489handle this by merely deleting the function we produced with the
Chris Lattner35abbf52007-10-23 06:23:57 +0000490<tt>eraseFromParent</tt> method. This allows the user to redefine a function
491that they incorrectly typed in before: if we didn't delete it, it would live in
492the symbol table, with a body, preventing future redefinition.</p>
493
Chris Lattner41fcea32007-11-13 07:06:30 +0000494<p>This code does have a bug, though. Since the <tt>PrototypeAST::Codegen</tt>
495can return a previously defined forward declaration, our code can actually delete
Chris Lattner35abbf52007-10-23 06:23:57 +0000496a forward declaration. There are a number of ways to fix this bug, see what you
497can come up with! Here is a testcase:</p>
498
499<div class="doc_code">
500<pre>
501extern foo(a b); # ok, defines foo.
502def foo(a b) c; # error, 'c' is invalid.
503def bar() foo(1, 2); # error, unknown function "foo"
504</pre>
505</div>
506
507</div>
508
509<!-- *********************************************************************** -->
510<div class="doc_section"><a name="driver">Driver Changes and
511Closing Thoughts</a></div>
512<!-- *********************************************************************** -->
513
514<div class="doc_text">
515
516<p>
517For now, code generation to LLVM doesn't really get us much, except that we can
518look at the pretty IR calls. The sample code inserts calls to Codegen into the
519"<tt>HandleDefinition</tt>", "<tt>HandleExtern</tt>" etc functions, and then
520dumps out the LLVM IR. This gives a nice way to look at the LLVM IR for simple
521functions. For example:
522</p>
523
524<div class="doc_code">
525<pre>
526ready> <b>4+5</b>;
Chris Lattnerd96b1592007-11-07 05:07:10 +0000527Read top-level expression:
Chris Lattner35abbf52007-10-23 06:23:57 +0000528define double @""() {
529entry:
530 %addtmp = add double 4.000000e+00, 5.000000e+00
531 ret double %addtmp
532}
533</pre>
534</div>
535
536<p>Note how the parser turns the top-level expression into anonymous functions
Chris Lattnerd96b1592007-11-07 05:07:10 +0000537for us. This will be handy when we add <a href="LangImpl4.html#jit">JIT
538support</a> in the next chapter. Also note that the code is very literally
539transcribed, no optimizations are being performed. We will
540<a href="LangImpl4.html#trivialconstfold">add optimizations</a> explicitly in
541the next chapter.</p>
Chris Lattner35abbf52007-10-23 06:23:57 +0000542
543<div class="doc_code">
544<pre>
545ready&gt; <b>def foo(a b) a*a + 2*a*b + b*b;</b>
Chris Lattnerd96b1592007-11-07 05:07:10 +0000546Read function definition:
Chris Lattner35abbf52007-10-23 06:23:57 +0000547define double @foo(double %a, double %b) {
548entry:
549 %multmp = mul double %a, %a
550 %multmp1 = mul double 2.000000e+00, %a
551 %multmp2 = mul double %multmp1, %b
552 %addtmp = add double %multmp, %multmp2
553 %multmp3 = mul double %b, %b
554 %addtmp4 = add double %addtmp, %multmp3
555 ret double %addtmp4
556}
557</pre>
558</div>
559
560<p>This shows some simple arithmetic. Notice the striking similarity to the
561LLVM builder calls that we use to create the instructions.</p>
562
563<div class="doc_code">
564<pre>
565ready&gt; <b>def bar(a) foo(a, 4.0) + bar(31337);</b>
Chris Lattnerd96b1592007-11-07 05:07:10 +0000566Read function definition:
Chris Lattner35abbf52007-10-23 06:23:57 +0000567define double @bar(double %a) {
568entry:
569 %calltmp = call double @foo( double %a, double 4.000000e+00 )
570 %calltmp1 = call double @bar( double 3.133700e+04 )
571 %addtmp = add double %calltmp, %calltmp1
572 ret double %addtmp
573}
574</pre>
575</div>
576
Chris Lattnerfc60ab02007-11-05 17:39:26 +0000577<p>This shows some function calls. Note that this function will take a long
578time to execute if you call it. In the future we'll add conditional control
Chris Lattner41fcea32007-11-13 07:06:30 +0000579flow to actually make recursion useful :).</p>
Chris Lattner35abbf52007-10-23 06:23:57 +0000580
581<div class="doc_code">
582<pre>
583ready&gt; <b>extern cos(x);</b>
Chris Lattnerd96b1592007-11-07 05:07:10 +0000584Read extern:
Chris Lattner35abbf52007-10-23 06:23:57 +0000585declare double @cos(double)
586
587ready&gt; <b>cos(1.234);</b>
Chris Lattnerd96b1592007-11-07 05:07:10 +0000588Read top-level expression:
Chris Lattner35abbf52007-10-23 06:23:57 +0000589define double @""() {
590entry:
Chris Lattner8eef4b22007-10-23 06:30:50 +0000591 %calltmp = call double @cos( double 1.234000e+00 )
Chris Lattner35abbf52007-10-23 06:23:57 +0000592 ret double %calltmp
593}
594</pre>
595</div>
596
597<p>This shows an extern for the libm "cos" function, and a call to it.</p>
598
599
600<div class="doc_code">
601<pre>
602ready&gt; <b>^D</b>
603; ModuleID = 'my cool jit'
604
605define double @""() {
606entry:
607 %addtmp = add double 4.000000e+00, 5.000000e+00
608 ret double %addtmp
609}
610
611define double @foo(double %a, double %b) {
612entry:
613 %multmp = mul double %a, %a
614 %multmp1 = mul double 2.000000e+00, %a
615 %multmp2 = mul double %multmp1, %b
616 %addtmp = add double %multmp, %multmp2
617 %multmp3 = mul double %b, %b
618 %addtmp4 = add double %addtmp, %multmp3
619 ret double %addtmp4
620}
621
622define double @bar(double %a) {
623entry:
624 %calltmp = call double @foo( double %a, double 4.000000e+00 )
625 %calltmp1 = call double @bar( double 3.133700e+04 )
626 %addtmp = add double %calltmp, %calltmp1
627 ret double %addtmp
628}
629
630declare double @cos(double)
631
632define double @""() {
633entry:
634 %calltmp = call double @cos( double 1.234000e+00 )
635 ret double %calltmp
636}
637</pre>
638</div>
639
640<p>When you quit the current demo, it dumps out the IR for the entire module
641generated. Here you can see the big picture with all the functions referencing
642each other.</p>
643
Chris Lattner41fcea32007-11-13 07:06:30 +0000644<p>This wraps up the third chapter of the Kaleidoscope tutorial. Up next, we'll
Chris Lattner35abbf52007-10-23 06:23:57 +0000645describe how to <a href="LangImpl4.html">add JIT codegen and optimizer
646support</a> to this so we can actually start running code!</p>
647
648</div>
649
650
651<!-- *********************************************************************** -->
652<div class="doc_section"><a name="code">Full Code Listing</a></div>
653<!-- *********************************************************************** -->
654
655<div class="doc_text">
656
657<p>
658Here is the complete code listing for our running example, enhanced with the
659LLVM code generator. Because this uses the LLVM libraries, we need to link
660them in. To do this, we use the <a
661href="http://llvm.org/cmds/llvm-config.html">llvm-config</a> tool to inform
662our makefile/command line about which options to use:</p>
663
664<div class="doc_code">
665<pre>
666 # Compile
Chris Lattnerd96b1592007-11-07 05:07:10 +0000667 g++ -g -O3 toy.cpp `llvm-config --cppflags --ldflags --libs core` -o toy
Chris Lattner35abbf52007-10-23 06:23:57 +0000668 # Run
669 ./toy
670</pre>
671</div>
672
673<p>Here is the code:</p>
674
Chris Lattner2e902042007-10-22 07:01:42 +0000675<div class="doc_code">
676<pre>
677// To build this:
Chris Lattner2e902042007-10-22 07:01:42 +0000678// See example below.
679
680#include "llvm/DerivedTypes.h"
681#include "llvm/Module.h"
Chris Lattnerd9b86162007-10-25 04:30:35 +0000682#include "llvm/Analysis/Verifier.h"
Duncan Sands89f6d882008-04-13 06:22:09 +0000683#include "llvm/Support/IRBuilder.h"
Chris Lattner2e902042007-10-22 07:01:42 +0000684#include &lt;cstdio&gt;
685#include &lt;string&gt;
686#include &lt;map&gt;
687#include &lt;vector&gt;
688using namespace llvm;
689
690//===----------------------------------------------------------------------===//
691// Lexer
692//===----------------------------------------------------------------------===//
693
694// The lexer returns tokens [0-255] if it is an unknown character, otherwise one
695// of these for known things.
696enum Token {
697 tok_eof = -1,
698
699 // commands
700 tok_def = -2, tok_extern = -3,
701
702 // primary
703 tok_identifier = -4, tok_number = -5,
704};
705
706static std::string IdentifierStr; // Filled in if tok_identifier
707static double NumVal; // Filled in if tok_number
708
709/// gettok - Return the next token from standard input.
710static int gettok() {
711 static int LastChar = ' ';
712
713 // Skip any whitespace.
714 while (isspace(LastChar))
715 LastChar = getchar();
716
717 if (isalpha(LastChar)) { // identifier: [a-zA-Z][a-zA-Z0-9]*
718 IdentifierStr = LastChar;
719 while (isalnum((LastChar = getchar())))
720 IdentifierStr += LastChar;
721
722 if (IdentifierStr == "def") return tok_def;
723 if (IdentifierStr == "extern") return tok_extern;
724 return tok_identifier;
725 }
726
727 if (isdigit(LastChar) || LastChar == '.') { // Number: [0-9.]+
728 std::string NumStr;
729 do {
730 NumStr += LastChar;
731 LastChar = getchar();
732 } while (isdigit(LastChar) || LastChar == '.');
733
734 NumVal = strtod(NumStr.c_str(), 0);
735 return tok_number;
736 }
737
738 if (LastChar == '#') {
739 // Comment until end of line.
740 do LastChar = getchar();
Chris Lattnerc80c23f2007-12-02 22:46:01 +0000741 while (LastChar != EOF &amp;&amp; LastChar != '\n' &amp;&amp; LastChar != '\r');
Chris Lattner2e902042007-10-22 07:01:42 +0000742
743 if (LastChar != EOF)
744 return gettok();
745 }
746
747 // Check for end of file. Don't eat the EOF.
748 if (LastChar == EOF)
749 return tok_eof;
750
751 // Otherwise, just return the character as its ascii value.
752 int ThisChar = LastChar;
753 LastChar = getchar();
754 return ThisChar;
755}
756
757//===----------------------------------------------------------------------===//
758// Abstract Syntax Tree (aka Parse Tree)
759//===----------------------------------------------------------------------===//
760
761/// ExprAST - Base class for all expression nodes.
762class ExprAST {
763public:
764 virtual ~ExprAST() {}
765 virtual Value *Codegen() = 0;
766};
767
768/// NumberExprAST - Expression class for numeric literals like "1.0".
769class NumberExprAST : public ExprAST {
770 double Val;
771public:
Chris Lattner28571ed2007-10-23 04:27:44 +0000772 explicit NumberExprAST(double val) : Val(val) {}
Chris Lattner2e902042007-10-22 07:01:42 +0000773 virtual Value *Codegen();
774};
775
776/// VariableExprAST - Expression class for referencing a variable, like "a".
777class VariableExprAST : public ExprAST {
778 std::string Name;
779public:
Chris Lattner28571ed2007-10-23 04:27:44 +0000780 explicit VariableExprAST(const std::string &amp;name) : Name(name) {}
Chris Lattner2e902042007-10-22 07:01:42 +0000781 virtual Value *Codegen();
782};
783
784/// BinaryExprAST - Expression class for a binary operator.
785class BinaryExprAST : public ExprAST {
786 char Op;
787 ExprAST *LHS, *RHS;
788public:
789 BinaryExprAST(char op, ExprAST *lhs, ExprAST *rhs)
790 : Op(op), LHS(lhs), RHS(rhs) {}
791 virtual Value *Codegen();
792};
793
794/// CallExprAST - Expression class for function calls.
795class CallExprAST : public ExprAST {
796 std::string Callee;
797 std::vector&lt;ExprAST*&gt; Args;
798public:
799 CallExprAST(const std::string &amp;callee, std::vector&lt;ExprAST*&gt; &amp;args)
800 : Callee(callee), Args(args) {}
801 virtual Value *Codegen();
802};
803
804/// PrototypeAST - This class represents the "prototype" for a function,
805/// which captures its argument names as well as if it is an operator.
806class PrototypeAST {
807 std::string Name;
808 std::vector&lt;std::string&gt; Args;
809public:
810 PrototypeAST(const std::string &amp;name, const std::vector&lt;std::string&gt; &amp;args)
811 : Name(name), Args(args) {}
812
813 Function *Codegen();
814};
815
816/// FunctionAST - This class represents a function definition itself.
817class FunctionAST {
818 PrototypeAST *Proto;
819 ExprAST *Body;
820public:
821 FunctionAST(PrototypeAST *proto, ExprAST *body)
822 : Proto(proto), Body(body) {}
823
824 Function *Codegen();
825};
826
827//===----------------------------------------------------------------------===//
828// Parser
829//===----------------------------------------------------------------------===//
830
831/// CurTok/getNextToken - Provide a simple token buffer. CurTok is the current
832/// token the parser it looking at. getNextToken reads another token from the
833/// lexer and updates CurTok with its results.
834static int CurTok;
835static int getNextToken() {
836 return CurTok = gettok();
837}
838
839/// BinopPrecedence - This holds the precedence for each binary operator that is
840/// defined.
841static std::map&lt;char, int&gt; BinopPrecedence;
842
843/// GetTokPrecedence - Get the precedence of the pending binary operator token.
844static int GetTokPrecedence() {
845 if (!isascii(CurTok))
846 return -1;
847
848 // Make sure it's a declared binop.
849 int TokPrec = BinopPrecedence[CurTok];
850 if (TokPrec &lt;= 0) return -1;
851 return TokPrec;
852}
853
854/// Error* - These are little helper functions for error handling.
855ExprAST *Error(const char *Str) { fprintf(stderr, "Error: %s\n", Str);return 0;}
856PrototypeAST *ErrorP(const char *Str) { Error(Str); return 0; }
857FunctionAST *ErrorF(const char *Str) { Error(Str); return 0; }
858
859static ExprAST *ParseExpression();
860
861/// identifierexpr
Chris Lattner20a0c802007-11-05 17:54:34 +0000862/// ::= identifier
863/// ::= identifier '(' expression* ')'
Chris Lattner2e902042007-10-22 07:01:42 +0000864static ExprAST *ParseIdentifierExpr() {
865 std::string IdName = IdentifierStr;
866
Chris Lattner20a0c802007-11-05 17:54:34 +0000867 getNextToken(); // eat identifier.
Chris Lattner2e902042007-10-22 07:01:42 +0000868
869 if (CurTok != '(') // Simple variable ref.
870 return new VariableExprAST(IdName);
871
872 // Call.
873 getNextToken(); // eat (
874 std::vector&lt;ExprAST*&gt; Args;
Chris Lattner71155212007-11-06 01:39:12 +0000875 if (CurTok != ')') {
876 while (1) {
877 ExprAST *Arg = ParseExpression();
878 if (!Arg) return 0;
879 Args.push_back(Arg);
Chris Lattner2e902042007-10-22 07:01:42 +0000880
Chris Lattner71155212007-11-06 01:39:12 +0000881 if (CurTok == ')') break;
Chris Lattner2e902042007-10-22 07:01:42 +0000882
Chris Lattner71155212007-11-06 01:39:12 +0000883 if (CurTok != ',')
Chris Lattner6c4be9c2008-04-14 16:44:41 +0000884 return Error("Expected ')' or ',' in argument list");
Chris Lattner71155212007-11-06 01:39:12 +0000885 getNextToken();
886 }
Chris Lattner2e902042007-10-22 07:01:42 +0000887 }
888
889 // Eat the ')'.
890 getNextToken();
891
892 return new CallExprAST(IdName, Args);
893}
894
895/// numberexpr ::= number
896static ExprAST *ParseNumberExpr() {
897 ExprAST *Result = new NumberExprAST(NumVal);
898 getNextToken(); // consume the number
899 return Result;
900}
901
902/// parenexpr ::= '(' expression ')'
903static ExprAST *ParseParenExpr() {
904 getNextToken(); // eat (.
905 ExprAST *V = ParseExpression();
906 if (!V) return 0;
907
908 if (CurTok != ')')
909 return Error("expected ')'");
910 getNextToken(); // eat ).
911 return V;
912}
913
914/// primary
915/// ::= identifierexpr
916/// ::= numberexpr
917/// ::= parenexpr
918static ExprAST *ParsePrimary() {
919 switch (CurTok) {
920 default: return Error("unknown token when expecting an expression");
921 case tok_identifier: return ParseIdentifierExpr();
922 case tok_number: return ParseNumberExpr();
923 case '(': return ParseParenExpr();
924 }
925}
926
927/// binoprhs
928/// ::= ('+' primary)*
929static ExprAST *ParseBinOpRHS(int ExprPrec, ExprAST *LHS) {
930 // If this is a binop, find its precedence.
931 while (1) {
932 int TokPrec = GetTokPrecedence();
933
934 // If this is a binop that binds at least as tightly as the current binop,
935 // consume it, otherwise we are done.
936 if (TokPrec &lt; ExprPrec)
937 return LHS;
938
939 // Okay, we know this is a binop.
940 int BinOp = CurTok;
941 getNextToken(); // eat binop
942
943 // Parse the primary expression after the binary operator.
944 ExprAST *RHS = ParsePrimary();
945 if (!RHS) return 0;
946
947 // If BinOp binds less tightly with RHS than the operator after RHS, let
948 // the pending operator take RHS as its LHS.
949 int NextPrec = GetTokPrecedence();
950 if (TokPrec &lt; NextPrec) {
951 RHS = ParseBinOpRHS(TokPrec+1, RHS);
952 if (RHS == 0) return 0;
953 }
954
955 // Merge LHS/RHS.
956 LHS = new BinaryExprAST(BinOp, LHS, RHS);
957 }
958}
959
960/// expression
961/// ::= primary binoprhs
962///
963static ExprAST *ParseExpression() {
964 ExprAST *LHS = ParsePrimary();
965 if (!LHS) return 0;
966
967 return ParseBinOpRHS(0, LHS);
968}
969
970/// prototype
971/// ::= id '(' id* ')'
972static PrototypeAST *ParsePrototype() {
973 if (CurTok != tok_identifier)
974 return ErrorP("Expected function name in prototype");
975
976 std::string FnName = IdentifierStr;
977 getNextToken();
978
979 if (CurTok != '(')
980 return ErrorP("Expected '(' in prototype");
981
982 std::vector&lt;std::string&gt; ArgNames;
983 while (getNextToken() == tok_identifier)
984 ArgNames.push_back(IdentifierStr);
985 if (CurTok != ')')
986 return ErrorP("Expected ')' in prototype");
987
988 // success.
989 getNextToken(); // eat ')'.
990
991 return new PrototypeAST(FnName, ArgNames);
992}
993
994/// definition ::= 'def' prototype expression
995static FunctionAST *ParseDefinition() {
996 getNextToken(); // eat def.
997 PrototypeAST *Proto = ParsePrototype();
998 if (Proto == 0) return 0;
999
1000 if (ExprAST *E = ParseExpression())
1001 return new FunctionAST(Proto, E);
1002 return 0;
1003}
1004
1005/// toplevelexpr ::= expression
1006static FunctionAST *ParseTopLevelExpr() {
1007 if (ExprAST *E = ParseExpression()) {
1008 // Make an anonymous proto.
1009 PrototypeAST *Proto = new PrototypeAST("", std::vector&lt;std::string&gt;());
1010 return new FunctionAST(Proto, E);
1011 }
1012 return 0;
1013}
1014
1015/// external ::= 'extern' prototype
1016static PrototypeAST *ParseExtern() {
1017 getNextToken(); // eat extern.
1018 return ParsePrototype();
1019}
1020
1021//===----------------------------------------------------------------------===//
1022// Code Generation
1023//===----------------------------------------------------------------------===//
1024
1025static Module *TheModule;
Duncan Sands89f6d882008-04-13 06:22:09 +00001026static IRBuilder Builder;
Chris Lattner2e902042007-10-22 07:01:42 +00001027static std::map&lt;std::string, Value*&gt; NamedValues;
1028
1029Value *ErrorV(const char *Str) { Error(Str); return 0; }
1030
1031Value *NumberExprAST::Codegen() {
1032 return ConstantFP::get(Type::DoubleTy, APFloat(Val));
1033}
1034
1035Value *VariableExprAST::Codegen() {
1036 // Look this variable up in the function.
1037 Value *V = NamedValues[Name];
1038 return V ? V : ErrorV("Unknown variable name");
1039}
1040
1041Value *BinaryExprAST::Codegen() {
1042 Value *L = LHS-&gt;Codegen();
1043 Value *R = RHS-&gt;Codegen();
1044 if (L == 0 || R == 0) return 0;
1045
1046 switch (Op) {
1047 case '+': return Builder.CreateAdd(L, R, "addtmp");
1048 case '-': return Builder.CreateSub(L, R, "subtmp");
1049 case '*': return Builder.CreateMul(L, R, "multmp");
1050 case '&lt;':
Chris Lattner71155212007-11-06 01:39:12 +00001051 L = Builder.CreateFCmpULT(L, R, "cmptmp");
Chris Lattner2e902042007-10-22 07:01:42 +00001052 // Convert bool 0/1 to double 0.0 or 1.0
1053 return Builder.CreateUIToFP(L, Type::DoubleTy, "booltmp");
1054 default: return ErrorV("invalid binary operator");
1055 }
1056}
1057
1058Value *CallExprAST::Codegen() {
1059 // Look up the name in the global module table.
1060 Function *CalleeF = TheModule-&gt;getFunction(Callee);
1061 if (CalleeF == 0)
1062 return ErrorV("Unknown function referenced");
1063
1064 // If argument mismatch error.
1065 if (CalleeF-&gt;arg_size() != Args.size())
1066 return ErrorV("Incorrect # arguments passed");
1067
1068 std::vector&lt;Value*&gt; ArgsV;
1069 for (unsigned i = 0, e = Args.size(); i != e; ++i) {
1070 ArgsV.push_back(Args[i]-&gt;Codegen());
1071 if (ArgsV.back() == 0) return 0;
1072 }
1073
1074 return Builder.CreateCall(CalleeF, ArgsV.begin(), ArgsV.end(), "calltmp");
1075}
1076
1077Function *PrototypeAST::Codegen() {
1078 // Make the function type: double(double,double) etc.
Chris Lattner35abbf52007-10-23 06:23:57 +00001079 std::vector&lt;const Type*&gt; Doubles(Args.size(), Type::DoubleTy);
1080 FunctionType *FT = FunctionType::get(Type::DoubleTy, Doubles, false);
Chris Lattner2e902042007-10-22 07:01:42 +00001081
1082 Function *F = new Function(FT, Function::ExternalLinkage, Name, TheModule);
1083
1084 // If F conflicted, there was already something named 'Name'. If it has a
1085 // body, don't allow redefinition or reextern.
1086 if (F-&gt;getName() != Name) {
1087 // Delete the one we just made and get the existing one.
1088 F-&gt;eraseFromParent();
1089 F = TheModule-&gt;getFunction(Name);
1090
1091 // If F already has a body, reject this.
1092 if (!F-&gt;empty()) {
1093 ErrorF("redefinition of function");
1094 return 0;
1095 }
1096
1097 // If F took a different number of args, reject.
1098 if (F-&gt;arg_size() != Args.size()) {
1099 ErrorF("redefinition of function with different # args");
1100 return 0;
1101 }
1102 }
1103
1104 // Set names for all arguments.
1105 unsigned Idx = 0;
1106 for (Function::arg_iterator AI = F-&gt;arg_begin(); Idx != Args.size();
1107 ++AI, ++Idx) {
1108 AI-&gt;setName(Args[Idx]);
1109
1110 // Add arguments to variable symbol table.
1111 NamedValues[Args[Idx]] = AI;
1112 }
1113
1114 return F;
1115}
1116
1117Function *FunctionAST::Codegen() {
1118 NamedValues.clear();
1119
1120 Function *TheFunction = Proto-&gt;Codegen();
1121 if (TheFunction == 0)
1122 return 0;
1123
1124 // Create a new basic block to start insertion into.
Chris Lattner35abbf52007-10-23 06:23:57 +00001125 BasicBlock *BB = new BasicBlock("entry", TheFunction);
1126 Builder.SetInsertPoint(BB);
Chris Lattner2e902042007-10-22 07:01:42 +00001127
1128 if (Value *RetVal = Body-&gt;Codegen()) {
1129 // Finish off the function.
1130 Builder.CreateRet(RetVal);
Chris Lattnerd9b86162007-10-25 04:30:35 +00001131
1132 // Validate the generated code, checking for consistency.
1133 verifyFunction(*TheFunction);
Chris Lattner2e902042007-10-22 07:01:42 +00001134 return TheFunction;
1135 }
1136
1137 // Error reading body, remove function.
1138 TheFunction-&gt;eraseFromParent();
1139 return 0;
1140}
1141
1142//===----------------------------------------------------------------------===//
1143// Top-Level parsing and JIT Driver
1144//===----------------------------------------------------------------------===//
1145
1146static void HandleDefinition() {
1147 if (FunctionAST *F = ParseDefinition()) {
1148 if (Function *LF = F-&gt;Codegen()) {
1149 fprintf(stderr, "Read function definition:");
1150 LF-&gt;dump();
1151 }
1152 } else {
1153 // Skip token for error recovery.
1154 getNextToken();
1155 }
1156}
1157
1158static void HandleExtern() {
1159 if (PrototypeAST *P = ParseExtern()) {
1160 if (Function *F = P-&gt;Codegen()) {
1161 fprintf(stderr, "Read extern: ");
1162 F-&gt;dump();
1163 }
1164 } else {
1165 // Skip token for error recovery.
1166 getNextToken();
1167 }
1168}
1169
1170static void HandleTopLevelExpression() {
1171 // Evaluate a top level expression into an anonymous function.
1172 if (FunctionAST *F = ParseTopLevelExpr()) {
1173 if (Function *LF = F-&gt;Codegen()) {
1174 fprintf(stderr, "Read top-level expression:");
1175 LF-&gt;dump();
1176 }
1177 } else {
1178 // Skip token for error recovery.
1179 getNextToken();
1180 }
1181}
1182
1183/// top ::= definition | external | expression | ';'
1184static void MainLoop() {
1185 while (1) {
1186 fprintf(stderr, "ready&gt; ");
1187 switch (CurTok) {
1188 case tok_eof: return;
1189 case ';': getNextToken(); break; // ignore top level semicolons.
1190 case tok_def: HandleDefinition(); break;
1191 case tok_extern: HandleExtern(); break;
1192 default: HandleTopLevelExpression(); break;
1193 }
1194 }
1195}
1196
1197
1198
1199//===----------------------------------------------------------------------===//
1200// "Library" functions that can be "extern'd" from user code.
1201//===----------------------------------------------------------------------===//
1202
1203/// putchard - putchar that takes a double and returns 0.
1204extern "C"
1205double putchard(double X) {
1206 putchar((char)X);
1207 return 0;
1208}
1209
1210//===----------------------------------------------------------------------===//
1211// Main driver code.
1212//===----------------------------------------------------------------------===//
1213
1214int main() {
1215 TheModule = new Module("my cool jit");
1216
1217 // Install standard binary operators.
1218 // 1 is lowest precedence.
1219 BinopPrecedence['&lt;'] = 10;
1220 BinopPrecedence['+'] = 20;
1221 BinopPrecedence['-'] = 20;
1222 BinopPrecedence['*'] = 40; // highest.
1223
1224 // Prime the first token.
1225 fprintf(stderr, "ready&gt; ");
1226 getNextToken();
1227
1228 MainLoop();
1229 TheModule-&gt;dump();
1230 return 0;
1231}
Chris Lattner2e902042007-10-22 07:01:42 +00001232</pre>
1233</div>
Chris Lattner729eb142008-02-10 19:11:04 +00001234<a href="LangImpl4.html">Next: Adding JIT and Optimizer Support</a>
Chris Lattner2e902042007-10-22 07:01:42 +00001235</div>
1236
1237<!-- *********************************************************************** -->
1238<hr>
1239<address>
1240 <a href="http://jigsaw.w3.org/css-validator/check/referer"><img
1241 src="http://jigsaw.w3.org/css-validator/images/vcss" alt="Valid CSS!"></a>
1242 <a href="http://validator.w3.org/check/referer"><img
Chris Lattner8eef4b22007-10-23 06:30:50 +00001243 src="http://www.w3.org/Icons/valid-html401" alt="Valid HTML 4.01!"></a>
Chris Lattner2e902042007-10-22 07:01:42 +00001244
1245 <a href="mailto:sabre@nondot.org">Chris Lattner</a><br>
1246 <a href="http://llvm.org">The LLVM Compiler Infrastructure</a><br>
1247 Last modified: $Date: 2007-10-17 11:05:13 -0700 (Wed, 17 Oct 2007) $
1248</address>
1249</body>
1250</html>