blob: 0258961ce6521c683455e4051d10da51582c7863 [file] [log] [blame]
Chris Lattner2e902042007-10-22 07:01:42 +00001<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
2 "http://www.w3.org/TR/html4/strict.dtd">
3
4<html>
5<head>
6 <title>Kaleidoscope: Implementing code generation to LLVM IR</title>
7 <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
8 <meta name="author" content="Chris Lattner">
9 <link rel="stylesheet" href="../llvm.css" type="text/css">
10</head>
11
12<body>
13
14<div class="doc_title">Kaleidoscope: Code generation to LLVM IR</div>
15
Chris Lattner128eb862007-11-05 19:06:59 +000016<ul>
Chris Lattner0e555b12007-11-05 20:04:56 +000017<li><a href="index.html">Up to Tutorial Index</a></li>
Chris Lattner128eb862007-11-05 19:06:59 +000018<li>Chapter 3
19 <ol>
20 <li><a href="#intro">Chapter 3 Introduction</a></li>
Chris Lattner7badb2d2007-11-06 07:26:32 +000021 <li><a href="#basics">Code Generation Setup</a></li>
Chris Lattner128eb862007-11-05 19:06:59 +000022 <li><a href="#exprs">Expression Code Generation</a></li>
23 <li><a href="#funcs">Function Code Generation</a></li>
24 <li><a href="#driver">Driver Changes and Closing Thoughts</a></li>
25 <li><a href="#code">Full Code Listing</a></li>
26 </ol>
27</li>
Chris Lattner0e555b12007-11-05 20:04:56 +000028<li><a href="LangImpl4.html">Chapter 4</a>: Adding JIT and Optimizer
29Support</li>
Chris Lattner128eb862007-11-05 19:06:59 +000030</ul>
31
Chris Lattner2e902042007-10-22 07:01:42 +000032<div class="doc_author">
33 <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a></p>
34</div>
35
36<!-- *********************************************************************** -->
Chris Lattner128eb862007-11-05 19:06:59 +000037<div class="doc_section"><a name="intro">Chapter 3 Introduction</a></div>
Chris Lattner2e902042007-10-22 07:01:42 +000038<!-- *********************************************************************** -->
39
40<div class="doc_text">
41
Chris Lattner128eb862007-11-05 19:06:59 +000042<p>Welcome to Chapter 3 of the "<a href="index.html">Implementing a language
43with LLVM</a>" tutorial. This chapter shows you how to transform the <a
Chris Lattner41fcea32007-11-13 07:06:30 +000044href="LangImpl2.html">Abstract Syntax Tree</a>, built in Chapter 2, into LLVM IR.
Chris Lattner2e902042007-10-22 07:01:42 +000045This will teach you a little bit about how LLVM does things, as well as
46demonstrate how easy it is to use. It's much more work to build a lexer and
Chris Lattner7badb2d2007-11-06 07:26:32 +000047parser than it is to generate LLVM IR code. :)
Chris Lattner2e902042007-10-22 07:01:42 +000048</p>
49
50</div>
51
52<!-- *********************************************************************** -->
Chris Lattner7badb2d2007-11-06 07:26:32 +000053<div class="doc_section"><a name="basics">Code Generation Setup</a></div>
Chris Lattner2e902042007-10-22 07:01:42 +000054<!-- *********************************************************************** -->
55
56<div class="doc_text">
57
58<p>
59In order to generate LLVM IR, we want some simple setup to get started. First,
60we define virtual codegen methods in each AST class:</p>
61
62<div class="doc_code">
63<pre>
64/// ExprAST - Base class for all expression nodes.
65class ExprAST {
66public:
67 virtual ~ExprAST() {}
Chris Lattnercac21352007-11-05 19:25:14 +000068 <b>virtual Value *Codegen() = 0;</b>
Chris Lattner2e902042007-10-22 07:01:42 +000069};
70
71/// NumberExprAST - Expression class for numeric literals like "1.0".
72class NumberExprAST : public ExprAST {
73 double Val;
74public:
Chris Lattner28571ed2007-10-23 04:27:44 +000075 explicit NumberExprAST(double val) : Val(val) {}
Chris Lattnercac21352007-11-05 19:25:14 +000076 <b>virtual Value *Codegen();</b>
Chris Lattner2e902042007-10-22 07:01:42 +000077};
78...
79</pre>
80</div>
81
Chris Lattner41fcea32007-11-13 07:06:30 +000082<p>The Codegen() method says to emit IR for that AST node along with all the things it
Chris Lattner28571ed2007-10-23 04:27:44 +000083depends on, and they all return an LLVM Value object.
84"Value" is the class used to represent a "<a
85href="http://en.wikipedia.org/wiki/Static_single_assignment_form">Static Single
86Assignment (SSA)</a> register" or "SSA value" in LLVM. The most distinct aspect
87of SSA values is that their value is computed as the related instruction
88executes, and it does not get a new value until (and if) the instruction
Chris Lattner41fcea32007-11-13 07:06:30 +000089re-executes. In other words, there is no way to "change" an SSA value. For
Chris Lattner28571ed2007-10-23 04:27:44 +000090more information, please read up on <a
91href="http://en.wikipedia.org/wiki/Static_single_assignment_form">Static Single
92Assignment</a> - the concepts are really quite natural once you grok them.</p>
93
Chris Lattnercac21352007-11-05 19:25:14 +000094<p>Note that instead of adding virtual methods to the ExprAST class hierarchy,
95it could also make sense to use a visitor pattern or some other way to model
96this. Again, this tutorial won't dwell on good software engineering practices:
Chris Lattnerbb80f932007-11-05 19:33:52 +000097for our purposes, adding a virtual method is simplest.</p>
Chris Lattnercac21352007-11-05 19:25:14 +000098
Chris Lattner28571ed2007-10-23 04:27:44 +000099<p>The
Chris Lattner41fcea32007-11-13 07:06:30 +0000100second thing we want is an "Error" method like we used for the parser, which will
Chris Lattner2e902042007-10-22 07:01:42 +0000101be used to report errors found during code generation (for example, use of an
102undeclared parameter):</p>
103
104<div class="doc_code">
105<pre>
106Value *ErrorV(const char *Str) { Error(Str); return 0; }
107
108static Module *TheModule;
109static LLVMBuilder Builder;
110static std::map&lt;std::string, Value*&gt; NamedValues;
111</pre>
112</div>
113
114<p>The static variables will be used during code generation. <tt>TheModule</tt>
115is the LLVM construct that contains all of the functions and global variables in
116a chunk of code. In many ways, it is the top-level structure that the LLVM IR
117uses to contain code.</p>
118
119<p>The <tt>Builder</tt> object is a helper object that makes it easy to generate
Chris Lattnerf6e53df2007-11-05 18:02:15 +0000120LLVM instructions. Instances of the <a
121href="http://llvm.org/doxygen/LLVMBuilder_8h-source.html"><tt>LLVMBuilder</tt>
Chris Lattner7badb2d2007-11-06 07:26:32 +0000122class</a> keep track of the current place to
Chris Lattner2e902042007-10-22 07:01:42 +0000123insert instructions and has methods to create new instructions.</p>
124
125<p>The <tt>NamedValues</tt> map keeps track of which values are defined in the
Chris Lattner7badb2d2007-11-06 07:26:32 +0000126current scope and what their LLVM representation is (in other words, it is a
127symbol table for the code). In this form of
Chris Lattner2e902042007-10-22 07:01:42 +0000128Kaleidoscope, the only things that can be referenced are function parameters.
129As such, function parameters will be in this map when generating code for their
130function body.</p>
131
132<p>
133With these basics in place, we can start talking about how to generate code for
134each expression. Note that this assumes that the <tt>Builder</tt> has been set
135up to generate code <em>into</em> something. For now, we'll assume that this
136has already been done, and we'll just use it to emit code.
137</p>
138
139</div>
140
141<!-- *********************************************************************** -->
142<div class="doc_section"><a name="exprs">Expression Code Generation</a></div>
143<!-- *********************************************************************** -->
144
145<div class="doc_text">
146
Chris Lattner41fcea32007-11-13 07:06:30 +0000147<p>Generating LLVM code for expression nodes is very straightforward: less
Chris Lattner2e902042007-10-22 07:01:42 +0000148than 45 lines of commented code for all four of our expression nodes. First,
149we'll do numeric literals:</p>
150
151<div class="doc_code">
152<pre>
153Value *NumberExprAST::Codegen() {
154 return ConstantFP::get(Type::DoubleTy, APFloat(Val));
155}
156</pre>
157</div>
158
Chris Lattnerd3f0cdd2007-10-23 04:51:30 +0000159<p>In the LLVM IR, numeric constants are represented with the
160<tt>ConstantFP</tt> class, which holds the numeric value in an <tt>APFloat</tt>
161internally (<tt>APFloat</tt> has the capability of holding floating point
162constants of <em>A</em>rbitrary <em>P</em>recision). This code basically just
163creates and returns a <tt>ConstantFP</tt>. Note that in the LLVM IR
Chris Lattner2e902042007-10-22 07:01:42 +0000164that constants are all uniqued together and shared. For this reason, the API
Chris Lattner7badb2d2007-11-06 07:26:32 +0000165uses "the foo::get(..)" idiom instead of "new foo(..)" or "foo::create(..)".</p>
Chris Lattner2e902042007-10-22 07:01:42 +0000166
167<div class="doc_code">
168<pre>
169Value *VariableExprAST::Codegen() {
170 // Look this variable up in the function.
171 Value *V = NamedValues[Name];
172 return V ? V : ErrorV("Unknown variable name");
173}
174</pre>
175</div>
176
Chris Lattner41fcea32007-11-13 07:06:30 +0000177<p>References to variables are also quite simple using LLVM. In the simple version
Chris Lattnerd3f0cdd2007-10-23 04:51:30 +0000178of Kaleidoscope, we assume that the variable has already been emited somewhere
179and its value is available. In practice, the only values that can be in the
180<tt>NamedValues</tt> map are function arguments. This
Chris Lattner2e902042007-10-22 07:01:42 +0000181code simply checks to see that the specified name is in the map (if not, an
Chris Lattner7badb2d2007-11-06 07:26:32 +0000182unknown variable is being referenced) and returns the value for it. In future
183chapters, we'll add support for <a href="LangImpl5.html#for">loop induction
184variables</a> in the symbol table, and for <a
185href="LangImpl7.html#localvars">local variables</a>.</p>
Chris Lattner2e902042007-10-22 07:01:42 +0000186
187<div class="doc_code">
188<pre>
189Value *BinaryExprAST::Codegen() {
190 Value *L = LHS-&gt;Codegen();
191 Value *R = RHS-&gt;Codegen();
192 if (L == 0 || R == 0) return 0;
193
194 switch (Op) {
195 case '+': return Builder.CreateAdd(L, R, "addtmp");
196 case '-': return Builder.CreateSub(L, R, "subtmp");
197 case '*': return Builder.CreateMul(L, R, "multmp");
198 case '&lt;':
Chris Lattner71155212007-11-06 01:39:12 +0000199 L = Builder.CreateFCmpULT(L, R, "cmptmp");
Chris Lattner2e902042007-10-22 07:01:42 +0000200 // Convert bool 0/1 to double 0.0 or 1.0
201 return Builder.CreateUIToFP(L, Type::DoubleTy, "booltmp");
202 default: return ErrorV("invalid binary operator");
203 }
204}
205</pre>
206</div>
207
Chris Lattnerd3f0cdd2007-10-23 04:51:30 +0000208<p>Binary operators start to get more interesting. The basic idea here is that
209we recursively emit code for the left-hand side of the expression, then the
210right-hand side, then we compute the result of the binary expression. In this
211code, we do a simple switch on the opcode to create the right LLVM instruction.
212</p>
Chris Lattner2e902042007-10-22 07:01:42 +0000213
Chris Lattner41fcea32007-11-13 07:06:30 +0000214<p>In the example above, the LLVM builder class is starting to show its value.
215LLVMBuilder knows where to insert the newly created instruction, all you have to
216do is specify what instruction to create (e.g. with <tt>CreateAdd</tt>), which
Chris Lattnerd3f0cdd2007-10-23 04:51:30 +0000217operands to use (<tt>L</tt> and <tt>R</tt> here) and optionally provide a name
218for the generated instruction. One nice thing about LLVM is that the name is
219just a hint: if there are multiple additions in a single function, the first
220will be named "addtmp" and the second will be "autorenamed" by adding a suffix,
221giving it a name like "addtmp42". Local value names for instructions are purely
222optional, but it makes it much easier to read the IR dumps.</p>
223
Chris Lattner41fcea32007-11-13 07:06:30 +0000224<p><a href="../LangRef.html#instref">LLVM instructions</a> are constrained by
Chris Lattner7badb2d2007-11-06 07:26:32 +0000225strict rules: for example, the Left and Right operators of
Chris Lattner41fcea32007-11-13 07:06:30 +0000226an <a href="../LangRef.html#i_add">add instruction</a> must have the same
227type, and the result type of the add must match the operand types. Because
Chris Lattner7badb2d2007-11-06 07:26:32 +0000228all values in Kaleidoscope are doubles, this makes for very simple code for add,
229sub and mul.</p>
Chris Lattnerd3f0cdd2007-10-23 04:51:30 +0000230
231<p>On the other hand, LLVM specifies that the <a
232href="../LangRef.html#i_fcmp">fcmp instruction</a> always returns an 'i1' value
Chris Lattner41fcea32007-11-13 07:06:30 +0000233(a one bit integer). The problem with this is that Kaleidoscope wants the value to be a 0.0 or 1.0 value. In order to get these semantics, we combine the fcmp instruction with
Chris Lattnerd3f0cdd2007-10-23 04:51:30 +0000234a <a href="../LangRef.html#i_uitofp">uitofp instruction</a>. This instruction
235converts its input integer into a floating point value by treating the input
236as an unsigned value. In contrast, if we used the <a
Chris Lattnerd96b1592007-11-07 05:07:10 +0000237href="../LangRef.html#i_sitofp">sitofp instruction</a>, the Kaleidoscope '&lt;'
Chris Lattnerd3f0cdd2007-10-23 04:51:30 +0000238operator would return 0.0 and -1.0, depending on the input value.</p>
Chris Lattner2e902042007-10-22 07:01:42 +0000239
240<div class="doc_code">
241<pre>
242Value *CallExprAST::Codegen() {
243 // Look up the name in the global module table.
244 Function *CalleeF = TheModule-&gt;getFunction(Callee);
245 if (CalleeF == 0)
246 return ErrorV("Unknown function referenced");
247
248 // If argument mismatch error.
249 if (CalleeF-&gt;arg_size() != Args.size())
250 return ErrorV("Incorrect # arguments passed");
251
252 std::vector&lt;Value*&gt; ArgsV;
253 for (unsigned i = 0, e = Args.size(); i != e; ++i) {
254 ArgsV.push_back(Args[i]-&gt;Codegen());
255 if (ArgsV.back() == 0) return 0;
256 }
257
258 return Builder.CreateCall(CalleeF, ArgsV.begin(), ArgsV.end(), "calltmp");
259}
260</pre>
261</div>
262
Chris Lattner41fcea32007-11-13 07:06:30 +0000263<p>Code generation for function calls is quite straightforward with LLVM. The
264code above initially does a function name lookup in the LLVM Module's symbol
Chris Lattnerd3f0cdd2007-10-23 04:51:30 +0000265table. Recall that the LLVM Module is the container that holds all of the
266functions we are JIT'ing. By giving each function the same name as what the
267user specifies, we can use the LLVM symbol table to resolve function names for
268us.</p>
269
270<p>Once we have the function to call, we recursively codegen each argument that
271is to be passed in, and create an LLVM <a href="../LangRef.html#i_call">call
272instruction</a>. Note that LLVM uses the native C calling conventions by
Chris Lattner41fcea32007-11-13 07:06:30 +0000273default, allowing these calls to also call into standard library functions like
274"sin" and "cos", with no additional effort.</p>
Chris Lattnerd3f0cdd2007-10-23 04:51:30 +0000275
276<p>This wraps up our handling of the four basic expressions that we have so far
277in Kaleidoscope. Feel free to go in and add some more. For example, by
278browsing the <a href="../LangRef.html">LLVM language reference</a> you'll find
279several other interesting instructions that are really easy to plug into our
280basic framework.</p>
Chris Lattner2e902042007-10-22 07:01:42 +0000281
282</div>
283
284<!-- *********************************************************************** -->
Chris Lattner35abbf52007-10-23 06:23:57 +0000285<div class="doc_section"><a name="funcs">Function Code Generation</a></div>
Chris Lattner2e902042007-10-22 07:01:42 +0000286<!-- *********************************************************************** -->
287
288<div class="doc_text">
289
Chris Lattnerd96b1592007-11-07 05:07:10 +0000290<p>Code generation for prototypes and functions must handle a number of
291details, which make their code less beautiful than expression code
292generation, but allows us to illustrate some important points. First, lets
293talk about code generation for prototypes: they are used both for function
294bodies and external function declarations. The code starts with:</p>
Chris Lattner35abbf52007-10-23 06:23:57 +0000295
296<div class="doc_code">
297<pre>
298Function *PrototypeAST::Codegen() {
299 // Make the function type: double(double,double) etc.
300 std::vector&lt;const Type*&gt; Doubles(Args.size(), Type::DoubleTy);
301 FunctionType *FT = FunctionType::get(Type::DoubleTy, Doubles, false);
302
303 Function *F = new Function(FT, Function::ExternalLinkage, Name, TheModule);
304</pre>
305</div>
306
Chris Lattnercf9893d2007-11-05 19:22:50 +0000307<p>This code packs a lot of power into a few lines. Note first that this
Chris Lattnerd96b1592007-11-07 05:07:10 +0000308function returns a "Function*" instead of a "Value*". Because a "prototype"
309really talks about the external interface for a function (not the value computed
310by an expression), it makes sense for it to return the LLVM Function it
311corresponds to when codegen'd.</p>
Chris Lattnercf9893d2007-11-05 19:22:50 +0000312
Chris Lattnerd96b1592007-11-07 05:07:10 +0000313<p>The call to <tt>FunctionType::get</tt> creates
Chris Lattner35abbf52007-10-23 06:23:57 +0000314the <tt>FunctionType</tt> that should be used for a given Prototype. Since all
315function arguments in Kaleidoscope are of type double, the first line creates
Chris Lattnerd96b1592007-11-07 05:07:10 +0000316a vector of "N" LLVM double types. It then uses the <tt>FunctionType::get</tt>
Chris Lattner35abbf52007-10-23 06:23:57 +0000317method to create a function type that takes "N" doubles as arguments, returns
318one double as a result, and that is not vararg (the false parameter indicates
319this). Note that Types in LLVM are uniqued just like Constants are, so you
320don't "new" a type, you "get" it.</p>
321
322<p>The final line above actually creates the function that the prototype will
Chris Lattner41fcea32007-11-13 07:06:30 +0000323correspond to. This indicates the type, linkage and name to use, as well as which
Chris Lattner35abbf52007-10-23 06:23:57 +0000324module to insert into. "<a href="LangRef.html#linkage">external linkage</a>"
325means that the function may be defined outside the current module and/or that it
326is callable by functions outside the module. The Name passed in is the name the
327user specified: since "<tt>TheModule</tt>" is specified, this name is registered
328in "<tt>TheModule</tt>"s symbol table, which is used by the function call code
329above.</p>
330
331<div class="doc_code">
332<pre>
333 // If F conflicted, there was already something named 'Name'. If it has a
334 // body, don't allow redefinition or reextern.
335 if (F-&gt;getName() != Name) {
336 // Delete the one we just made and get the existing one.
337 F-&gt;eraseFromParent();
338 F = TheModule-&gt;getFunction(Name);
339</pre>
340</div>
341
342<p>The Module symbol table works just like the Function symbol table when it
343comes to name conflicts: if a new function is created with a name was previously
344added to the symbol table, it will get implicitly renamed when added to the
Chris Lattner41fcea32007-11-13 07:06:30 +0000345Module. The code above exploits this fact to determine if there was a previous
Chris Lattner35abbf52007-10-23 06:23:57 +0000346definition of this function.</p>
347
348<p>In Kaleidoscope, I choose to allow redefinitions of functions in two cases:
Chris Lattnerd96b1592007-11-07 05:07:10 +0000349first, we want to allow 'extern'ing a function more than once, as long as the
Chris Lattner35abbf52007-10-23 06:23:57 +0000350prototypes for the externs match (since all arguments have the same type, we
351just have to check that the number of arguments match). Second, we want to
352allow 'extern'ing a function and then definining a body for it. This is useful
353when defining mutually recursive functions.</p>
354
355<p>In order to implement this, the code above first checks to see if there is
356a collision on the name of the function. If so, it deletes the function we just
357created (by calling <tt>eraseFromParent</tt>) and then calling
358<tt>getFunction</tt> to get the existing function with the specified name. Note
359that many APIs in LLVM have "erase" forms and "remove" forms. The "remove" form
360unlinks the object from its parent (e.g. a Function from a Module) and returns
361it. The "erase" form unlinks the object and then deletes it.</p>
362
363<div class="doc_code">
364<pre>
365 // If F already has a body, reject this.
366 if (!F-&gt;empty()) {
367 ErrorF("redefinition of function");
368 return 0;
369 }
370
371 // If F took a different number of args, reject.
372 if (F-&gt;arg_size() != Args.size()) {
373 ErrorF("redefinition of function with different # args");
374 return 0;
375 }
376 }
377</pre>
378</div>
379
Chris Lattnerd96b1592007-11-07 05:07:10 +0000380<p>In order to verify the logic above, we first check to see if the pre-existing
Chris Lattner35abbf52007-10-23 06:23:57 +0000381function is "empty". In this case, empty means that it has no basic blocks in
Chris Lattnerd96b1592007-11-07 05:07:10 +0000382it, which means it has no body. If it has no body, it is a forward
Chris Lattner35abbf52007-10-23 06:23:57 +0000383declaration. Since we don't allow anything after a full definition of the
384function, the code rejects this case. If the previous reference to a function
385was an 'extern', we simply verify that the number of arguments for that
386definition and this one match up. If not, we emit an error.</p>
387
388<div class="doc_code">
389<pre>
390 // Set names for all arguments.
391 unsigned Idx = 0;
392 for (Function::arg_iterator AI = F-&gt;arg_begin(); Idx != Args.size();
393 ++AI, ++Idx) {
394 AI-&gt;setName(Args[Idx]);
395
396 // Add arguments to variable symbol table.
397 NamedValues[Args[Idx]] = AI;
398 }
399 return F;
400}
401</pre>
402</div>
403
404<p>The last bit of code for prototypes loops over all of the arguments in the
Chris Lattner41fcea32007-11-13 07:06:30 +0000405function, setting the name of the LLVM Argument objects to match, and registering
Chris Lattner35abbf52007-10-23 06:23:57 +0000406the arguments in the <tt>NamedValues</tt> map for future use by the
407<tt>VariableExprAST</tt> AST node. Once this is set up, it returns the Function
408object to the caller. Note that we don't check for conflicting
409argument names here (e.g. "extern foo(a b a)"). Doing so would be very
Chris Lattnerd96b1592007-11-07 05:07:10 +0000410straight-forward with the mechanics we have already used above.</p>
Chris Lattner35abbf52007-10-23 06:23:57 +0000411
412<div class="doc_code">
413<pre>
414Function *FunctionAST::Codegen() {
415 NamedValues.clear();
416
417 Function *TheFunction = Proto-&gt;Codegen();
418 if (TheFunction == 0)
419 return 0;
420</pre>
421</div>
422
Chris Lattner41fcea32007-11-13 07:06:30 +0000423<p>Code generation for function definitions starts out simply enough: we just
424codegen the prototype (Proto) and verify that it is ok. We then clear out the
Chris Lattner35abbf52007-10-23 06:23:57 +0000425<tt>NamedValues</tt> map to make sure that there isn't anything in it from the
Chris Lattnera1cd2242007-11-06 05:07:30 +0000426last function we compiled. Code generation of the prototype ensures that there
427is an LLVM Function object that is ready to go for us.</p>
Chris Lattner35abbf52007-10-23 06:23:57 +0000428
429<div class="doc_code">
430<pre>
431 // Create a new basic block to start insertion into.
432 BasicBlock *BB = new BasicBlock("entry", TheFunction);
433 Builder.SetInsertPoint(BB);
434
435 if (Value *RetVal = Body-&gt;Codegen()) {
Chris Lattner35abbf52007-10-23 06:23:57 +0000436</pre>
437</div>
438
439<p>Now we get to the point where the <tt>Builder</tt> is set up. The first
440line creates a new <a href="http://en.wikipedia.org/wiki/Basic_block">basic
441block</a> (named "entry"), which is inserted into <tt>TheFunction</tt>. The
442second line then tells the builder that new instructions should be inserted into
443the end of the new basic block. Basic blocks in LLVM are an important part
444of functions that define the <a
445href="http://en.wikipedia.org/wiki/Control_flow_graph">Control Flow Graph</a>.
446Since we don't have any control flow, our functions will only contain one
Chris Lattner41fcea32007-11-13 07:06:30 +0000447block at this point. We'll fix this in <a href="LangImpl5.html">Chapter 5</a> :).</p>
Chris Lattner35abbf52007-10-23 06:23:57 +0000448
Chris Lattnerd9b86162007-10-25 04:30:35 +0000449<div class="doc_code">
450<pre>
451 if (Value *RetVal = Body-&gt;Codegen()) {
452 // Finish off the function.
453 Builder.CreateRet(RetVal);
454
455 // Validate the generated code, checking for consistency.
456 verifyFunction(*TheFunction);
457 return TheFunction;
458 }
459</pre>
460</div>
461
Chris Lattner35abbf52007-10-23 06:23:57 +0000462<p>Once the insertion point is set up, we call the <tt>CodeGen()</tt> method for
463the root expression of the function. If no error happens, this emits code to
464compute the expression into the entry block and returns the value that was
465computed. Assuming no error, we then create an LLVM <a
Chris Lattnerd9b86162007-10-25 04:30:35 +0000466href="../LangRef.html#i_ret">ret instruction</a>, which completes the function.
Chris Lattner41fcea32007-11-13 07:06:30 +0000467Once the function is built, we call <tt>verifyFunction</tt>, which
Chris Lattnerd9b86162007-10-25 04:30:35 +0000468is provided by LLVM. This function does a variety of consistency checks on the
469generated code, to determine if our compiler is doing everything right. Using
470this is important: it can catch a lot of bugs. Once the function is finished
471and validated, we return it.</p>
Chris Lattner35abbf52007-10-23 06:23:57 +0000472
473<div class="doc_code">
474<pre>
475 // Error reading body, remove function.
476 TheFunction-&gt;eraseFromParent();
477 return 0;
478}
479</pre>
480</div>
481
482<p>The only piece left here is handling of the error case. For simplicity, we
Chris Lattner41fcea32007-11-13 07:06:30 +0000483handle this by merely deleting the function we produced with the
Chris Lattner35abbf52007-10-23 06:23:57 +0000484<tt>eraseFromParent</tt> method. This allows the user to redefine a function
485that they incorrectly typed in before: if we didn't delete it, it would live in
486the symbol table, with a body, preventing future redefinition.</p>
487
Chris Lattner41fcea32007-11-13 07:06:30 +0000488<p>This code does have a bug, though. Since the <tt>PrototypeAST::Codegen</tt>
489can return a previously defined forward declaration, our code can actually delete
Chris Lattner35abbf52007-10-23 06:23:57 +0000490a forward declaration. There are a number of ways to fix this bug, see what you
491can come up with! Here is a testcase:</p>
492
493<div class="doc_code">
494<pre>
495extern foo(a b); # ok, defines foo.
496def foo(a b) c; # error, 'c' is invalid.
497def bar() foo(1, 2); # error, unknown function "foo"
498</pre>
499</div>
500
501</div>
502
503<!-- *********************************************************************** -->
504<div class="doc_section"><a name="driver">Driver Changes and
505Closing Thoughts</a></div>
506<!-- *********************************************************************** -->
507
508<div class="doc_text">
509
510<p>
511For now, code generation to LLVM doesn't really get us much, except that we can
512look at the pretty IR calls. The sample code inserts calls to Codegen into the
513"<tt>HandleDefinition</tt>", "<tt>HandleExtern</tt>" etc functions, and then
514dumps out the LLVM IR. This gives a nice way to look at the LLVM IR for simple
515functions. For example:
516</p>
517
518<div class="doc_code">
519<pre>
520ready> <b>4+5</b>;
Chris Lattnerd96b1592007-11-07 05:07:10 +0000521Read top-level expression:
Chris Lattner35abbf52007-10-23 06:23:57 +0000522define double @""() {
523entry:
524 %addtmp = add double 4.000000e+00, 5.000000e+00
525 ret double %addtmp
526}
527</pre>
528</div>
529
530<p>Note how the parser turns the top-level expression into anonymous functions
Chris Lattnerd96b1592007-11-07 05:07:10 +0000531for us. This will be handy when we add <a href="LangImpl4.html#jit">JIT
532support</a> in the next chapter. Also note that the code is very literally
533transcribed, no optimizations are being performed. We will
534<a href="LangImpl4.html#trivialconstfold">add optimizations</a> explicitly in
535the next chapter.</p>
Chris Lattner35abbf52007-10-23 06:23:57 +0000536
537<div class="doc_code">
538<pre>
539ready&gt; <b>def foo(a b) a*a + 2*a*b + b*b;</b>
Chris Lattnerd96b1592007-11-07 05:07:10 +0000540Read function definition:
Chris Lattner35abbf52007-10-23 06:23:57 +0000541define double @foo(double %a, double %b) {
542entry:
543 %multmp = mul double %a, %a
544 %multmp1 = mul double 2.000000e+00, %a
545 %multmp2 = mul double %multmp1, %b
546 %addtmp = add double %multmp, %multmp2
547 %multmp3 = mul double %b, %b
548 %addtmp4 = add double %addtmp, %multmp3
549 ret double %addtmp4
550}
551</pre>
552</div>
553
554<p>This shows some simple arithmetic. Notice the striking similarity to the
555LLVM builder calls that we use to create the instructions.</p>
556
557<div class="doc_code">
558<pre>
559ready&gt; <b>def bar(a) foo(a, 4.0) + bar(31337);</b>
Chris Lattnerd96b1592007-11-07 05:07:10 +0000560Read function definition:
Chris Lattner35abbf52007-10-23 06:23:57 +0000561define double @bar(double %a) {
562entry:
563 %calltmp = call double @foo( double %a, double 4.000000e+00 )
564 %calltmp1 = call double @bar( double 3.133700e+04 )
565 %addtmp = add double %calltmp, %calltmp1
566 ret double %addtmp
567}
568</pre>
569</div>
570
Chris Lattnerfc60ab02007-11-05 17:39:26 +0000571<p>This shows some function calls. Note that this function will take a long
572time to execute if you call it. In the future we'll add conditional control
Chris Lattner41fcea32007-11-13 07:06:30 +0000573flow to actually make recursion useful :).</p>
Chris Lattner35abbf52007-10-23 06:23:57 +0000574
575<div class="doc_code">
576<pre>
577ready&gt; <b>extern cos(x);</b>
Chris Lattnerd96b1592007-11-07 05:07:10 +0000578Read extern:
Chris Lattner35abbf52007-10-23 06:23:57 +0000579declare double @cos(double)
580
581ready&gt; <b>cos(1.234);</b>
Chris Lattnerd96b1592007-11-07 05:07:10 +0000582Read top-level expression:
Chris Lattner35abbf52007-10-23 06:23:57 +0000583define double @""() {
584entry:
Chris Lattner8eef4b22007-10-23 06:30:50 +0000585 %calltmp = call double @cos( double 1.234000e+00 )
Chris Lattner35abbf52007-10-23 06:23:57 +0000586 ret double %calltmp
587}
588</pre>
589</div>
590
591<p>This shows an extern for the libm "cos" function, and a call to it.</p>
592
593
594<div class="doc_code">
595<pre>
596ready&gt; <b>^D</b>
597; ModuleID = 'my cool jit'
598
599define double @""() {
600entry:
601 %addtmp = add double 4.000000e+00, 5.000000e+00
602 ret double %addtmp
603}
604
605define double @foo(double %a, double %b) {
606entry:
607 %multmp = mul double %a, %a
608 %multmp1 = mul double 2.000000e+00, %a
609 %multmp2 = mul double %multmp1, %b
610 %addtmp = add double %multmp, %multmp2
611 %multmp3 = mul double %b, %b
612 %addtmp4 = add double %addtmp, %multmp3
613 ret double %addtmp4
614}
615
616define double @bar(double %a) {
617entry:
618 %calltmp = call double @foo( double %a, double 4.000000e+00 )
619 %calltmp1 = call double @bar( double 3.133700e+04 )
620 %addtmp = add double %calltmp, %calltmp1
621 ret double %addtmp
622}
623
624declare double @cos(double)
625
626define double @""() {
627entry:
628 %calltmp = call double @cos( double 1.234000e+00 )
629 ret double %calltmp
630}
631</pre>
632</div>
633
634<p>When you quit the current demo, it dumps out the IR for the entire module
635generated. Here you can see the big picture with all the functions referencing
636each other.</p>
637
Chris Lattner41fcea32007-11-13 07:06:30 +0000638<p>This wraps up the third chapter of the Kaleidoscope tutorial. Up next, we'll
Chris Lattner35abbf52007-10-23 06:23:57 +0000639describe how to <a href="LangImpl4.html">add JIT codegen and optimizer
640support</a> to this so we can actually start running code!</p>
641
642</div>
643
644
645<!-- *********************************************************************** -->
646<div class="doc_section"><a name="code">Full Code Listing</a></div>
647<!-- *********************************************************************** -->
648
649<div class="doc_text">
650
651<p>
652Here is the complete code listing for our running example, enhanced with the
653LLVM code generator. Because this uses the LLVM libraries, we need to link
654them in. To do this, we use the <a
655href="http://llvm.org/cmds/llvm-config.html">llvm-config</a> tool to inform
656our makefile/command line about which options to use:</p>
657
658<div class="doc_code">
659<pre>
660 # Compile
Chris Lattnerd96b1592007-11-07 05:07:10 +0000661 g++ -g -O3 toy.cpp `llvm-config --cppflags --ldflags --libs core` -o toy
Chris Lattner35abbf52007-10-23 06:23:57 +0000662 # Run
663 ./toy
664</pre>
665</div>
666
667<p>Here is the code:</p>
668
Chris Lattner2e902042007-10-22 07:01:42 +0000669<div class="doc_code">
670<pre>
671// To build this:
Chris Lattner2e902042007-10-22 07:01:42 +0000672// See example below.
673
674#include "llvm/DerivedTypes.h"
675#include "llvm/Module.h"
Chris Lattnerd9b86162007-10-25 04:30:35 +0000676#include "llvm/Analysis/Verifier.h"
Chris Lattner2e902042007-10-22 07:01:42 +0000677#include "llvm/Support/LLVMBuilder.h"
678#include &lt;cstdio&gt;
679#include &lt;string&gt;
680#include &lt;map&gt;
681#include &lt;vector&gt;
682using namespace llvm;
683
684//===----------------------------------------------------------------------===//
685// Lexer
686//===----------------------------------------------------------------------===//
687
688// The lexer returns tokens [0-255] if it is an unknown character, otherwise one
689// of these for known things.
690enum Token {
691 tok_eof = -1,
692
693 // commands
694 tok_def = -2, tok_extern = -3,
695
696 // primary
697 tok_identifier = -4, tok_number = -5,
698};
699
700static std::string IdentifierStr; // Filled in if tok_identifier
701static double NumVal; // Filled in if tok_number
702
703/// gettok - Return the next token from standard input.
704static int gettok() {
705 static int LastChar = ' ';
706
707 // Skip any whitespace.
708 while (isspace(LastChar))
709 LastChar = getchar();
710
711 if (isalpha(LastChar)) { // identifier: [a-zA-Z][a-zA-Z0-9]*
712 IdentifierStr = LastChar;
713 while (isalnum((LastChar = getchar())))
714 IdentifierStr += LastChar;
715
716 if (IdentifierStr == "def") return tok_def;
717 if (IdentifierStr == "extern") return tok_extern;
718 return tok_identifier;
719 }
720
721 if (isdigit(LastChar) || LastChar == '.') { // Number: [0-9.]+
722 std::string NumStr;
723 do {
724 NumStr += LastChar;
725 LastChar = getchar();
726 } while (isdigit(LastChar) || LastChar == '.');
727
728 NumVal = strtod(NumStr.c_str(), 0);
729 return tok_number;
730 }
731
732 if (LastChar == '#') {
733 // Comment until end of line.
734 do LastChar = getchar();
735 while (LastChar != EOF &amp;&amp; LastChar != '\n' &amp; LastChar != '\r');
736
737 if (LastChar != EOF)
738 return gettok();
739 }
740
741 // Check for end of file. Don't eat the EOF.
742 if (LastChar == EOF)
743 return tok_eof;
744
745 // Otherwise, just return the character as its ascii value.
746 int ThisChar = LastChar;
747 LastChar = getchar();
748 return ThisChar;
749}
750
751//===----------------------------------------------------------------------===//
752// Abstract Syntax Tree (aka Parse Tree)
753//===----------------------------------------------------------------------===//
754
755/// ExprAST - Base class for all expression nodes.
756class ExprAST {
757public:
758 virtual ~ExprAST() {}
759 virtual Value *Codegen() = 0;
760};
761
762/// NumberExprAST - Expression class for numeric literals like "1.0".
763class NumberExprAST : public ExprAST {
764 double Val;
765public:
Chris Lattner28571ed2007-10-23 04:27:44 +0000766 explicit NumberExprAST(double val) : Val(val) {}
Chris Lattner2e902042007-10-22 07:01:42 +0000767 virtual Value *Codegen();
768};
769
770/// VariableExprAST - Expression class for referencing a variable, like "a".
771class VariableExprAST : public ExprAST {
772 std::string Name;
773public:
Chris Lattner28571ed2007-10-23 04:27:44 +0000774 explicit VariableExprAST(const std::string &amp;name) : Name(name) {}
Chris Lattner2e902042007-10-22 07:01:42 +0000775 virtual Value *Codegen();
776};
777
778/// BinaryExprAST - Expression class for a binary operator.
779class BinaryExprAST : public ExprAST {
780 char Op;
781 ExprAST *LHS, *RHS;
782public:
783 BinaryExprAST(char op, ExprAST *lhs, ExprAST *rhs)
784 : Op(op), LHS(lhs), RHS(rhs) {}
785 virtual Value *Codegen();
786};
787
788/// CallExprAST - Expression class for function calls.
789class CallExprAST : public ExprAST {
790 std::string Callee;
791 std::vector&lt;ExprAST*&gt; Args;
792public:
793 CallExprAST(const std::string &amp;callee, std::vector&lt;ExprAST*&gt; &amp;args)
794 : Callee(callee), Args(args) {}
795 virtual Value *Codegen();
796};
797
798/// PrototypeAST - This class represents the "prototype" for a function,
799/// which captures its argument names as well as if it is an operator.
800class PrototypeAST {
801 std::string Name;
802 std::vector&lt;std::string&gt; Args;
803public:
804 PrototypeAST(const std::string &amp;name, const std::vector&lt;std::string&gt; &amp;args)
805 : Name(name), Args(args) {}
806
807 Function *Codegen();
808};
809
810/// FunctionAST - This class represents a function definition itself.
811class FunctionAST {
812 PrototypeAST *Proto;
813 ExprAST *Body;
814public:
815 FunctionAST(PrototypeAST *proto, ExprAST *body)
816 : Proto(proto), Body(body) {}
817
818 Function *Codegen();
819};
820
821//===----------------------------------------------------------------------===//
822// Parser
823//===----------------------------------------------------------------------===//
824
825/// CurTok/getNextToken - Provide a simple token buffer. CurTok is the current
826/// token the parser it looking at. getNextToken reads another token from the
827/// lexer and updates CurTok with its results.
828static int CurTok;
829static int getNextToken() {
830 return CurTok = gettok();
831}
832
833/// BinopPrecedence - This holds the precedence for each binary operator that is
834/// defined.
835static std::map&lt;char, int&gt; BinopPrecedence;
836
837/// GetTokPrecedence - Get the precedence of the pending binary operator token.
838static int GetTokPrecedence() {
839 if (!isascii(CurTok))
840 return -1;
841
842 // Make sure it's a declared binop.
843 int TokPrec = BinopPrecedence[CurTok];
844 if (TokPrec &lt;= 0) return -1;
845 return TokPrec;
846}
847
848/// Error* - These are little helper functions for error handling.
849ExprAST *Error(const char *Str) { fprintf(stderr, "Error: %s\n", Str);return 0;}
850PrototypeAST *ErrorP(const char *Str) { Error(Str); return 0; }
851FunctionAST *ErrorF(const char *Str) { Error(Str); return 0; }
852
853static ExprAST *ParseExpression();
854
855/// identifierexpr
Chris Lattner20a0c802007-11-05 17:54:34 +0000856/// ::= identifier
857/// ::= identifier '(' expression* ')'
Chris Lattner2e902042007-10-22 07:01:42 +0000858static ExprAST *ParseIdentifierExpr() {
859 std::string IdName = IdentifierStr;
860
Chris Lattner20a0c802007-11-05 17:54:34 +0000861 getNextToken(); // eat identifier.
Chris Lattner2e902042007-10-22 07:01:42 +0000862
863 if (CurTok != '(') // Simple variable ref.
864 return new VariableExprAST(IdName);
865
866 // Call.
867 getNextToken(); // eat (
868 std::vector&lt;ExprAST*&gt; Args;
Chris Lattner71155212007-11-06 01:39:12 +0000869 if (CurTok != ')') {
870 while (1) {
871 ExprAST *Arg = ParseExpression();
872 if (!Arg) return 0;
873 Args.push_back(Arg);
Chris Lattner2e902042007-10-22 07:01:42 +0000874
Chris Lattner71155212007-11-06 01:39:12 +0000875 if (CurTok == ')') break;
Chris Lattner2e902042007-10-22 07:01:42 +0000876
Chris Lattner71155212007-11-06 01:39:12 +0000877 if (CurTok != ',')
878 return Error("Expected ')'");
879 getNextToken();
880 }
Chris Lattner2e902042007-10-22 07:01:42 +0000881 }
882
883 // Eat the ')'.
884 getNextToken();
885
886 return new CallExprAST(IdName, Args);
887}
888
889/// numberexpr ::= number
890static ExprAST *ParseNumberExpr() {
891 ExprAST *Result = new NumberExprAST(NumVal);
892 getNextToken(); // consume the number
893 return Result;
894}
895
896/// parenexpr ::= '(' expression ')'
897static ExprAST *ParseParenExpr() {
898 getNextToken(); // eat (.
899 ExprAST *V = ParseExpression();
900 if (!V) return 0;
901
902 if (CurTok != ')')
903 return Error("expected ')'");
904 getNextToken(); // eat ).
905 return V;
906}
907
908/// primary
909/// ::= identifierexpr
910/// ::= numberexpr
911/// ::= parenexpr
912static ExprAST *ParsePrimary() {
913 switch (CurTok) {
914 default: return Error("unknown token when expecting an expression");
915 case tok_identifier: return ParseIdentifierExpr();
916 case tok_number: return ParseNumberExpr();
917 case '(': return ParseParenExpr();
918 }
919}
920
921/// binoprhs
922/// ::= ('+' primary)*
923static ExprAST *ParseBinOpRHS(int ExprPrec, ExprAST *LHS) {
924 // If this is a binop, find its precedence.
925 while (1) {
926 int TokPrec = GetTokPrecedence();
927
928 // If this is a binop that binds at least as tightly as the current binop,
929 // consume it, otherwise we are done.
930 if (TokPrec &lt; ExprPrec)
931 return LHS;
932
933 // Okay, we know this is a binop.
934 int BinOp = CurTok;
935 getNextToken(); // eat binop
936
937 // Parse the primary expression after the binary operator.
938 ExprAST *RHS = ParsePrimary();
939 if (!RHS) return 0;
940
941 // If BinOp binds less tightly with RHS than the operator after RHS, let
942 // the pending operator take RHS as its LHS.
943 int NextPrec = GetTokPrecedence();
944 if (TokPrec &lt; NextPrec) {
945 RHS = ParseBinOpRHS(TokPrec+1, RHS);
946 if (RHS == 0) return 0;
947 }
948
949 // Merge LHS/RHS.
950 LHS = new BinaryExprAST(BinOp, LHS, RHS);
951 }
952}
953
954/// expression
955/// ::= primary binoprhs
956///
957static ExprAST *ParseExpression() {
958 ExprAST *LHS = ParsePrimary();
959 if (!LHS) return 0;
960
961 return ParseBinOpRHS(0, LHS);
962}
963
964/// prototype
965/// ::= id '(' id* ')'
966static PrototypeAST *ParsePrototype() {
967 if (CurTok != tok_identifier)
968 return ErrorP("Expected function name in prototype");
969
970 std::string FnName = IdentifierStr;
971 getNextToken();
972
973 if (CurTok != '(')
974 return ErrorP("Expected '(' in prototype");
975
976 std::vector&lt;std::string&gt; ArgNames;
977 while (getNextToken() == tok_identifier)
978 ArgNames.push_back(IdentifierStr);
979 if (CurTok != ')')
980 return ErrorP("Expected ')' in prototype");
981
982 // success.
983 getNextToken(); // eat ')'.
984
985 return new PrototypeAST(FnName, ArgNames);
986}
987
988/// definition ::= 'def' prototype expression
989static FunctionAST *ParseDefinition() {
990 getNextToken(); // eat def.
991 PrototypeAST *Proto = ParsePrototype();
992 if (Proto == 0) return 0;
993
994 if (ExprAST *E = ParseExpression())
995 return new FunctionAST(Proto, E);
996 return 0;
997}
998
999/// toplevelexpr ::= expression
1000static FunctionAST *ParseTopLevelExpr() {
1001 if (ExprAST *E = ParseExpression()) {
1002 // Make an anonymous proto.
1003 PrototypeAST *Proto = new PrototypeAST("", std::vector&lt;std::string&gt;());
1004 return new FunctionAST(Proto, E);
1005 }
1006 return 0;
1007}
1008
1009/// external ::= 'extern' prototype
1010static PrototypeAST *ParseExtern() {
1011 getNextToken(); // eat extern.
1012 return ParsePrototype();
1013}
1014
1015//===----------------------------------------------------------------------===//
1016// Code Generation
1017//===----------------------------------------------------------------------===//
1018
1019static Module *TheModule;
1020static LLVMBuilder Builder;
1021static std::map&lt;std::string, Value*&gt; NamedValues;
1022
1023Value *ErrorV(const char *Str) { Error(Str); return 0; }
1024
1025Value *NumberExprAST::Codegen() {
1026 return ConstantFP::get(Type::DoubleTy, APFloat(Val));
1027}
1028
1029Value *VariableExprAST::Codegen() {
1030 // Look this variable up in the function.
1031 Value *V = NamedValues[Name];
1032 return V ? V : ErrorV("Unknown variable name");
1033}
1034
1035Value *BinaryExprAST::Codegen() {
1036 Value *L = LHS-&gt;Codegen();
1037 Value *R = RHS-&gt;Codegen();
1038 if (L == 0 || R == 0) return 0;
1039
1040 switch (Op) {
1041 case '+': return Builder.CreateAdd(L, R, "addtmp");
1042 case '-': return Builder.CreateSub(L, R, "subtmp");
1043 case '*': return Builder.CreateMul(L, R, "multmp");
1044 case '&lt;':
Chris Lattner71155212007-11-06 01:39:12 +00001045 L = Builder.CreateFCmpULT(L, R, "cmptmp");
Chris Lattner2e902042007-10-22 07:01:42 +00001046 // Convert bool 0/1 to double 0.0 or 1.0
1047 return Builder.CreateUIToFP(L, Type::DoubleTy, "booltmp");
1048 default: return ErrorV("invalid binary operator");
1049 }
1050}
1051
1052Value *CallExprAST::Codegen() {
1053 // Look up the name in the global module table.
1054 Function *CalleeF = TheModule-&gt;getFunction(Callee);
1055 if (CalleeF == 0)
1056 return ErrorV("Unknown function referenced");
1057
1058 // If argument mismatch error.
1059 if (CalleeF-&gt;arg_size() != Args.size())
1060 return ErrorV("Incorrect # arguments passed");
1061
1062 std::vector&lt;Value*&gt; ArgsV;
1063 for (unsigned i = 0, e = Args.size(); i != e; ++i) {
1064 ArgsV.push_back(Args[i]-&gt;Codegen());
1065 if (ArgsV.back() == 0) return 0;
1066 }
1067
1068 return Builder.CreateCall(CalleeF, ArgsV.begin(), ArgsV.end(), "calltmp");
1069}
1070
1071Function *PrototypeAST::Codegen() {
1072 // Make the function type: double(double,double) etc.
Chris Lattner35abbf52007-10-23 06:23:57 +00001073 std::vector&lt;const Type*&gt; Doubles(Args.size(), Type::DoubleTy);
1074 FunctionType *FT = FunctionType::get(Type::DoubleTy, Doubles, false);
Chris Lattner2e902042007-10-22 07:01:42 +00001075
1076 Function *F = new Function(FT, Function::ExternalLinkage, Name, TheModule);
1077
1078 // If F conflicted, there was already something named 'Name'. If it has a
1079 // body, don't allow redefinition or reextern.
1080 if (F-&gt;getName() != Name) {
1081 // Delete the one we just made and get the existing one.
1082 F-&gt;eraseFromParent();
1083 F = TheModule-&gt;getFunction(Name);
1084
1085 // If F already has a body, reject this.
1086 if (!F-&gt;empty()) {
1087 ErrorF("redefinition of function");
1088 return 0;
1089 }
1090
1091 // If F took a different number of args, reject.
1092 if (F-&gt;arg_size() != Args.size()) {
1093 ErrorF("redefinition of function with different # args");
1094 return 0;
1095 }
1096 }
1097
1098 // Set names for all arguments.
1099 unsigned Idx = 0;
1100 for (Function::arg_iterator AI = F-&gt;arg_begin(); Idx != Args.size();
1101 ++AI, ++Idx) {
1102 AI-&gt;setName(Args[Idx]);
1103
1104 // Add arguments to variable symbol table.
1105 NamedValues[Args[Idx]] = AI;
1106 }
1107
1108 return F;
1109}
1110
1111Function *FunctionAST::Codegen() {
1112 NamedValues.clear();
1113
1114 Function *TheFunction = Proto-&gt;Codegen();
1115 if (TheFunction == 0)
1116 return 0;
1117
1118 // Create a new basic block to start insertion into.
Chris Lattner35abbf52007-10-23 06:23:57 +00001119 BasicBlock *BB = new BasicBlock("entry", TheFunction);
1120 Builder.SetInsertPoint(BB);
Chris Lattner2e902042007-10-22 07:01:42 +00001121
1122 if (Value *RetVal = Body-&gt;Codegen()) {
1123 // Finish off the function.
1124 Builder.CreateRet(RetVal);
Chris Lattnerd9b86162007-10-25 04:30:35 +00001125
1126 // Validate the generated code, checking for consistency.
1127 verifyFunction(*TheFunction);
Chris Lattner2e902042007-10-22 07:01:42 +00001128 return TheFunction;
1129 }
1130
1131 // Error reading body, remove function.
1132 TheFunction-&gt;eraseFromParent();
1133 return 0;
1134}
1135
1136//===----------------------------------------------------------------------===//
1137// Top-Level parsing and JIT Driver
1138//===----------------------------------------------------------------------===//
1139
1140static void HandleDefinition() {
1141 if (FunctionAST *F = ParseDefinition()) {
1142 if (Function *LF = F-&gt;Codegen()) {
1143 fprintf(stderr, "Read function definition:");
1144 LF-&gt;dump();
1145 }
1146 } else {
1147 // Skip token for error recovery.
1148 getNextToken();
1149 }
1150}
1151
1152static void HandleExtern() {
1153 if (PrototypeAST *P = ParseExtern()) {
1154 if (Function *F = P-&gt;Codegen()) {
1155 fprintf(stderr, "Read extern: ");
1156 F-&gt;dump();
1157 }
1158 } else {
1159 // Skip token for error recovery.
1160 getNextToken();
1161 }
1162}
1163
1164static void HandleTopLevelExpression() {
1165 // Evaluate a top level expression into an anonymous function.
1166 if (FunctionAST *F = ParseTopLevelExpr()) {
1167 if (Function *LF = F-&gt;Codegen()) {
1168 fprintf(stderr, "Read top-level expression:");
1169 LF-&gt;dump();
1170 }
1171 } else {
1172 // Skip token for error recovery.
1173 getNextToken();
1174 }
1175}
1176
1177/// top ::= definition | external | expression | ';'
1178static void MainLoop() {
1179 while (1) {
1180 fprintf(stderr, "ready&gt; ");
1181 switch (CurTok) {
1182 case tok_eof: return;
1183 case ';': getNextToken(); break; // ignore top level semicolons.
1184 case tok_def: HandleDefinition(); break;
1185 case tok_extern: HandleExtern(); break;
1186 default: HandleTopLevelExpression(); break;
1187 }
1188 }
1189}
1190
1191
1192
1193//===----------------------------------------------------------------------===//
1194// "Library" functions that can be "extern'd" from user code.
1195//===----------------------------------------------------------------------===//
1196
1197/// putchard - putchar that takes a double and returns 0.
1198extern "C"
1199double putchard(double X) {
1200 putchar((char)X);
1201 return 0;
1202}
1203
1204//===----------------------------------------------------------------------===//
1205// Main driver code.
1206//===----------------------------------------------------------------------===//
1207
1208int main() {
1209 TheModule = new Module("my cool jit");
1210
1211 // Install standard binary operators.
1212 // 1 is lowest precedence.
1213 BinopPrecedence['&lt;'] = 10;
1214 BinopPrecedence['+'] = 20;
1215 BinopPrecedence['-'] = 20;
1216 BinopPrecedence['*'] = 40; // highest.
1217
1218 // Prime the first token.
1219 fprintf(stderr, "ready&gt; ");
1220 getNextToken();
1221
1222 MainLoop();
1223 TheModule-&gt;dump();
1224 return 0;
1225}
Chris Lattner2e902042007-10-22 07:01:42 +00001226</pre>
1227</div>
1228</div>
1229
1230<!-- *********************************************************************** -->
1231<hr>
1232<address>
1233 <a href="http://jigsaw.w3.org/css-validator/check/referer"><img
1234 src="http://jigsaw.w3.org/css-validator/images/vcss" alt="Valid CSS!"></a>
1235 <a href="http://validator.w3.org/check/referer"><img
Chris Lattner8eef4b22007-10-23 06:30:50 +00001236 src="http://www.w3.org/Icons/valid-html401" alt="Valid HTML 4.01!"></a>
Chris Lattner2e902042007-10-22 07:01:42 +00001237
1238 <a href="mailto:sabre@nondot.org">Chris Lattner</a><br>
1239 <a href="http://llvm.org">The LLVM Compiler Infrastructure</a><br>
1240 Last modified: $Date: 2007-10-17 11:05:13 -0700 (Wed, 17 Oct 2007) $
1241</address>
1242</body>
1243</html>