Blame - docs/tutorial/LangImpl7.html - fp2-dev/platform/external/llvm

Chris Lattner	00c992d	2007-11-03 08:55:29 +0000	[diff] [blame]	1	<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
				2	"http://www.w3.org/TR/html4/strict.dtd">
				3
				4	<html>
				5	<head>
				6	<title>Kaleidoscope: Extending the Language: Mutable Variables / SSA
				7	construction</title>
				8	<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
				9	<meta name="author" content="Chris Lattner">
				10	<link rel="stylesheet" href="../llvm.css" type="text/css">
				11	</head>
				12
				13	<body>
				14
				15	<div class="doc_title">Kaleidoscope: Extending the Language: Mutable Variables</div>
				16
				17	<div class="doc_author">
				18	<p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a></p>
				19	</div>
				20
				21	<!-- *********************************************************************** -->
				22	<div class="doc_section"><a name="intro">Part 7 Introduction</a></div>
				23	<!-- *********************************************************************** -->
				24
				25	<div class="doc_text">
				26
				27	<p>Welcome to Part 7 of the "<a href="index.html">Implementing a language with
				28	LLVM</a>" tutorial. In parts 1 through 6, we've built a very respectable,
				29	albeit simple, <a
				30	href="http://en.wikipedia.org/wiki/Functional_programming">functional
				31	programming language</a>. In our journey, we learned some parsing techniques,
				32	how to build and represent an AST, how to build LLVM IR, and how to optimize
				33	the resultant code and JIT compile it.</p>
				34
				35	<p>While Kaleidoscope is interesting as a functional language, being functional
				36	makes it "too easy" to generate LLVM IR for it. In particular, a functional
				37	language makes it very easy to build LLVM IR directly in <a
				38	href="http://en.wikipedia.org/wiki/Static_single_assignment_form">SSA form</a>.
				39	Since LLVM requires that the input code be in SSA form, this is a very nice
				40	property. However, it is often unclear to newcomers how to generate code for
				41	an imperative language with mutable variables.</p>
				42
				43	<p>The short (and happy) summary of this chapter is that there is no need for
				44	your front-end to build SSA form: LLVM provides highly tuned and well tested
				45	support for this, though the way it works is a bit unexpected for some.</p>
				46
				47	</div>
				48
				49	<!-- *********************************************************************** -->
				50	<div class="doc_section"><a name="why">Why is this a hard problem?</a></div>
				51	<!-- *********************************************************************** -->
				52
				53	<div class="doc_text">
				54
				55	<p>
				56	To understand why mutable variables cause complexities in SSA construction,
				57	consider this extremely simple C example:
				58	</p>
				59
				60	<div class="doc_code">
				61	<pre>
				62	int G, H;
				63	int test(_Bool Condition) {
				64	  int X;
				65	  if (Condition)
				66	    X = G;
				67	  else
				68	    X = H;
				69	  return X;
				70	}
				71	</pre>
				72	</div>
				73
				74	<p>In this case, we have the variable "X", whose value depends on the path
				75	executed in the program. Because there are two different possible values for X
				76	before the return instruction, a PHI node is inserted to merge the two values.
				77	The LLVM IR that we want for this example looks like this:</p>
				78
				79	<div class="doc_code">
				80	<pre>
				81	@G = weak global i32 0 ; type of @G is i32*
				82	@H = weak global i32 0 ; type of @H is i32*
				83
				84	define i32 @test(i1 %Condition) {
				85	entry:
				86	br i1 %Condition, label %cond_true, label %cond_false
				87
				88	cond_true:
				89	%X.0 = load i32* @G
				90	br label %cond_next
				91
				92	cond_false:
				93	%X.1 = load i32* @H
				94	br label %cond_next
				95
				96	cond_next:
				97	%X.2 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
				98	ret i32 %X.2
				99	}
				100	</pre>
				101	</div>
				102
				103	<p>In this example, the loads from the G and H global variables are explicit in
				104	the LLVM IR, and they live in the then/else branches of the if statement
				105	(cond_true/cond_false). In order to merge the incoming values, the X.2 phi node
				106	in the cond_next block selects the right value to use based on where control
				107	flow is coming from: if control flow comes from the cond_false block, X.2 gets
				108	the value of X.1. Alternatively, if control flow comes from cond_true, it gets
				109	the value of X.0. The intent of this chapter is not to explain the details of
				110	SSA form. For more information, see one of the many <a
				111	href="http://en.wikipedia.org/wiki/Static_single_assignment_form">online
				112	references</a>.</p>
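<p>As a rough illustration of what the phi node computes, here is a minimal
C++ sketch. It is not LLVM code: the function <tt>test_ssa</tt>, the
<tt>Pred</tt> string, and the initial values of <tt>G</tt> and <tt>H</tt> are
assumptions made up for this example. It only models the idea that the merged
value is chosen by the block control flow arrived from.</p>

```cpp
#include <cassert>
#include <string>

// Stand-ins for @G and @H; the initial values are arbitrary for this sketch.
int G = 10;
int H = 20;

// Hypothetical model of the IR above. Each branch writes its own name
// (%X.0 or %X.1), and the merge point picks one by predecessor block.
int test_ssa(bool Condition) {
  std::string Pred;   // name of the block we arrived from
  int X0 = 0, X1 = 0; // %X.0 and %X.1, each written on only one path

  if (Condition) {    // cond_true
    X0 = G;
    Pred = "cond_true";
  } else {            // cond_false
    X1 = H;
    Pred = "cond_false";
  }

  // cond_next: %X.2 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
  int X2 = (Pred == "cond_false") ? X1 : X0;
  return X2;
}
```

<p>Calling <tt>test_ssa(true)</tt> takes the cond_true path, so the modeled phi
yields the value loaded from <tt>G</tt>; <tt>test_ssa(false)</tt> yields the
value loaded from <tt>H</tt>.</p>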
				113
				114	<p>The question for this article is "who places phi nodes when lowering
				115	assignments to mutable variables?". The issue here is that LLVM
				116	<em>requires</em> that its IR be in SSA form: there is no "non-SSA" mode for it.
				117	However, SSA construction requires non-trivial algorithms and data structures,
				118	so it is inconvenient and wasteful for every front-end to have to reproduce this
				119	logic.</p>
				120
				121	</div>
				122
				123	<!-- *********************************************************************** -->
				124	<div class="doc_section"><a name="memory">Memory in LLVM</a></div>
				125	<!-- *********************************************************************** -->
				126
				127	<div class="doc_text">
				128
				129	<p>The 'trick' here is that while LLVM does require all register values to be
				130	in SSA form, it does not require (or permit) memory objects to be in SSA form.
				131	In the example above, note that the loads from G and H are direct accesses to
				132	G and H: they are not renamed or versioned. This differs from some other
Chris Lattner	2e5d07e	2007-11-04 19:42:13 +0000	[diff] [blame]	133	compiler systems, which do try to version memory objects. In LLVM, instead of
Chris Lattner	00c992d	2007-11-03 08:55:29 +0000	[diff] [blame]	134	encoding dataflow analysis of memory into the LLVM IR, it is handled with <a
				135	href="../WritingAnLLVMPass.html">Analysis Passes</a> which are computed on
				136	demand.</p>
				137
				138	<p>
				139	With this in mind, the high-level idea is that we want to make a stack variable
				140	(which lives in memory, because it is on the stack) for each mutable object in
				141	a function. To take advantage of this trick, we need to talk about how LLVM
				142	represents stack variables.
				143	</p>
				144
				145	<p>In LLVM, all memory accesses are explicit with load/store instructions, and
				146	it is carefully designed to not have (or need) an "address-of" operator. Notice
				147	how the type of the @G/@H global variables is actually "i32*" even though the
				148	variable is defined as "i32". What this means is that @G defines <em>space</em>
				149	for an i32 in the global data area, but its <em>name</em> actually refers to the
				150	address for that space. Stack variables work the same way, but instead of being
				151	declared with global variable definitions, they are declared with the
				152	<a href="../LangRef.html#i_alloca">LLVM alloca instruction</a>:</p>
				153
				154	<div class="doc_code">
				155	<pre>
				156	define i32 @test(i1 %Condition) {
				157	entry:
				158	%X = alloca i32 ; type of %X is i32*.
				159	...
				160	%tmp = load i32* %X ; load the stack value %X from the stack.
				161	%tmp2 = add i32 %tmp, 1 ; increment it
				162	store i32 %tmp2, i32* %X ; store it back
				163	...
				164	</pre>
				165	</div>
				166
				167	<p>This code shows an example of how you can declare and manipulate a stack
				168	variable in the LLVM IR. Stack memory allocated with the alloca instruction is
				169	fully general: you can pass the address of the stack slot to functions, you can
				170	store it in other variables, etc. In our example above, we could rewrite the
				171	example to use the alloca technique to avoid using a PHI node:</p>
				172
				173	<div class="doc_code">
				174	<pre>
				175	@G = weak global i32 0 ; type of @G is i32*
				176	@H = weak global i32 0 ; type of @H is i32*
				177
				178	define i32 @test(i1 %Condition) {
				179	entry:
				180	%X = alloca i32 ; type of %X is i32*.
				181	br i1 %Condition, label %cond_true, label %cond_false
				182
				183	cond_true:
				184	%X.0 = load i32* @G
				185	store i32 %X.0, i32* %X ; Update X
				186	br label %cond_next
				187
				188	cond_false:
				189	%X.1 = load i32* @H
				190	store i32 %X.1, i32* %X ; Update X
				191	br label %cond_next
				192
				193	cond_next:
				194	%X.2 = load i32* %X ; Read X
				195	ret i32 %X.2
				196	}
				197	</pre>
				198	</div>
				199
				200	<p>With this, we have discovered a way to handle arbitrary mutable variables
				201	without the need to create Phi nodes at all:</p>
				202
				203	<ol>
				204	<li>Each mutable variable becomes a stack allocation.</li>
				205	<li>Each read of the variable becomes a load from the stack.</li>
				206	<li>Each update of the variable becomes a store to the stack.</li>
				207	<li>Taking the address of a variable just uses the stack address directly.</li>
				208	</ol>
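<p>The first three rules can be sketched with a toy emitter that produces
textual IR in the style used in this chapter. This is only an illustration
under stated assumptions: <tt>ToyEmitter</tt>, its methods, and
<tt>lowerIncrement</tt> are hypothetical names, and the sketch builds strings
rather than calling any LLVM API.</p>

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical toy emitter: collects textual IR lines. Not an LLVM class.
struct ToyEmitter {
  std::vector<std::string> Lines;
  unsigned NextTmp = 0;

  // Rule 1: a mutable variable becomes a stack allocation.
  void declare(const std::string &Var) {
    Lines.push_back("%" + Var + " = alloca i32");
  }

  // Rule 2: a read becomes a load. Returns the fresh name holding the value.
  std::string read(const std::string &Var) {
    std::string Tmp = "%tmp" + std::to_string(NextTmp++);
    Lines.push_back(Tmp + " = load i32* %" + Var);
    return Tmp;
  }

  // Rule 3: an update becomes a store.
  void write(const std::string &Var, const std::string &Value) {
    Lines.push_back("store i32 " + Value + ", i32* %" + Var);
  }
};

// Lowers the C fragment "int X; X = 1; X = X + 1;" without any phi nodes.
std::vector<std::string> lowerIncrement() {
  ToyEmitter E;
  E.declare("X");
  E.write("X", "1");
  std::string Old = E.read("X");
  E.Lines.push_back("%inc = add i32 " + Old + ", 1");
  E.write("X", "%inc");
  return E.Lines;
}
```

<p>Note that the emitter never needs to know which definition of X reaches a
given use: every read simply reloads from the stack slot.</p>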
				209
				210	<p>While this solution has solved our immediate problem, it introduced another
				211	one: we have now apparently introduced a lot of stack traffic for very simple
				212	and common operations, a major performance problem. Fortunately for us, the
				213	LLVM optimizer has a highly-tuned optimization pass named "mem2reg" that handles
				214	this case, promoting allocas like this into SSA registers, inserting Phi nodes
				215	as appropriate. If you run this example through the pass, for example, you'll
				216	get:</p>
				217
				218	<div class="doc_code">
				219	<pre>
				220	$ <b>llvm-as < example.ll | opt -mem2reg | llvm-dis</b>
				221	@G = weak global i32 0
				222	@H = weak global i32 0
				223
				224	define i32 @test(i1 %Condition) {
				225	entry:
				226	br i1 %Condition, label %cond_true, label %cond_false
				227
				228	cond_true:
				229	%X.0 = load i32* @G
				230	br label %cond_next
				231
				232	cond_false:
				233	%X.1 = load i32* @H
				234	br label %cond_next
				235
				236	cond_next:
				237	%X.01 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
				238	ret i32 %X.01
				239	}
				240	</pre>
Chris Lattner	e719831	2007-11-03 22:22:30 +0000	[diff] [blame]	241	</div>
Chris Lattner	00c992d	2007-11-03 08:55:29 +0000	[diff] [blame]	242
Chris Lattner	e719831	2007-11-03 22:22:30 +0000	[diff] [blame]	243	<p>The mem2reg pass implements the standard "iterated dominator frontier"
				244	algorithm for constructing SSA form and has a number of optimizations that speed
				245	up very common degenerate cases. mem2reg really is the answer for dealing with
				246	mutable variables, and we highly recommend that you depend on it. Note that
				247	mem2reg only works on variables in certain circumstances:</p>
Chris Lattner	00c992d	2007-11-03 08:55:29 +0000	[diff] [blame]	248
Chris Lattner	e719831	2007-11-03 22:22:30 +0000	[diff] [blame]	249	<ol>
				250	<li>mem2reg is alloca-driven: it looks for allocas and if it can handle them, it
				251	promotes them. It does not apply to global variables or heap allocations.</li>
Chris Lattner	00c992d	2007-11-03 08:55:29 +0000	[diff] [blame]	252
Chris Lattner	e719831	2007-11-03 22:22:30 +0000	[diff] [blame]	253	<li>mem2reg only looks for alloca instructions in the entry block of the
				254	function. Being in the entry block guarantees that the alloca is only executed
				255	once, which makes analysis simpler.</li>
Chris Lattner	00c992d	2007-11-03 08:55:29 +0000	[diff] [blame]	256
Chris Lattner	e719831	2007-11-03 22:22:30 +0000	[diff] [blame]	257	<li>mem2reg only promotes allocas whose uses are direct loads and stores. If
				258	the address of the stack object is passed to a function, or if any funny pointer
				259	arithmetic is involved, the alloca will not be promoted.</li>
				260
				261	<li>mem2reg only works on allocas of scalar values, and only if the array size
				262	of the allocation is 1 (or missing in the .ll file). mem2reg is not capable of
				263	promoting structs or arrays to registers. Note that the "scalarrepl" pass is
				264	more powerful and can promote structs, "unions", and arrays in many cases.</li>
				265
				266	</ol>
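<p>The conditions above can be read as a simple checklist. The sketch below
encodes them as a predicate over a made-up summary record. The
<tt>AllocaInfo</tt> struct and <tt>looksPromotable</tt> function are
hypothetical and exist only for this illustration; the real pass inspects the
IR directly.</p>

```cpp
#include <cassert>

// Hypothetical summary of one alloca. The field names are invented for this
// sketch and do not correspond to LLVM classes.
struct AllocaInfo {
  bool InEntryBlock;       // the alloca sits in the function's entry block
  bool OnlyLoadsAndStores; // every use is a direct load or store
  bool IsScalar;           // not a struct or array type
  unsigned ArraySize;      // number of elements allocated
};

// Returns true when the record satisfies the conditions listed above.
bool looksPromotable(const AllocaInfo &A) {
  return A.InEntryBlock && A.OnlyLoadsAndStores && A.IsScalar &&
         A.ArraySize == 1;
}
```

<p>For example, a scalar alloca in the entry block whose address is passed to
a function fails the <tt>OnlyLoadsAndStores</tt> check and would stay in
memory.</p>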
				267
				268	<p>
				269	All of these properties are easy to satisfy for most imperative languages, and
Chris Lattner	2e5d07e	2007-11-04 19:42:13 +0000	[diff] [blame]	270	we'll illustrate this below with Kaleidoscope. The final question you may be
Chris Lattner	e719831	2007-11-03 22:22:30 +0000	[diff] [blame]	271	asking is: should I bother with this nonsense for my front-end? Wouldn't it be
				272	better if I just did SSA construction directly, avoiding use of the mem2reg
				273	optimization pass? In short, we strongly recommend that you use this technique
				274	for building SSA form, unless there is an extremely good reason not to. Using
				275	this technique is:</p>
				276
				277	<ul>
				278	<li>Proven and well tested: llvm-gcc and clang both use this technique for local
				279	mutable variables. As such, the most common clients of LLVM are using this to
				280	handle the bulk of their variables. You can be sure that bugs are found fast and
				281	fixed early.</li>
				282
				283	<li>Extremely Fast: mem2reg has a number of special cases that make it fast in
				284	common cases as well as fully general. For example, it has fast-paths for
				285	variables that are only used in a single block, variables that only have one
				286	assignment point, good heuristics to avoid insertion of unneeded phi nodes, etc.
				287	</li>
				288
				289	<li>Needed for debug info generation: <a href="../SourceLevelDebugging.html">
				290	Debug information in LLVM</a> relies on having the address of the variable
				291	exposed to attach debug info to it. This technique dovetails very naturally
				292	with this style of debug info.</li>
				293	</ul>
				294
				295	<p>If nothing else, this makes it much easier to get your front-end up and
				296	running, and is very simple to implement. Let's extend Kaleidoscope with mutable
				297	variables now!
Chris Lattner	00c992d	2007-11-03 08:55:29 +0000	[diff] [blame]	298	</p>
Chris Lattner	62a709d	2007-11-05 00:23:57 +0000	[diff] [blame^]	299
Chris Lattner	00c992d	2007-11-03 08:55:29 +0000	[diff] [blame]	300	</div>
				301
Chris Lattner	62a709d	2007-11-05 00:23:57 +0000	[diff] [blame^]	302	<!-- *********************************************************************** -->
				303	<div class="doc_section"><a name="kalvars">Mutable Variables in
				304	Kaleidoscope</a></div>
				305	<!-- *********************************************************************** -->
				306
				307	<div class="doc_text">
				308
				309	<p>Now that we know the sort of problem we want to tackle, let's see what this
				310	looks like in the context of our little Kaleidoscope language. We're going to
				311	add two features:</p>
				312
				313	<ol>
				314	<li>The ability to mutate variables with the '=' operator.</li>
				315	<li>The ability to define new variables.</li>
				316	</ol>
				317
				318	<p>While the first item is really what this is about, we only have variables
				319	for incoming arguments and for induction variables, and redefining them only
				320	goes so far :). Also, the ability to define new variables is a
				321	useful thing regardless of whether you will be mutating them. Here's a
				322	motivating example that shows how we could use these:</p>
				323
				324	<div class="doc_code">
				325	<pre>
				326	# Define ':' for sequencing: as a low-precedence operator that ignores operands
				327	# and just returns the RHS.
				328	def binary : 1 (x y) y;
				329
				330	# Recursive fib, we could do this before.
				331	def fib(x)
				332	  if (x < 3) then
				333	    1
				334	  else
				335	    fib(x-1)+fib(x-2);
				336
				337	# Iterative fib.
				338	def fibi(x)
				339	  <b>var a = 1, b = 1, c in</b>
				340	  (for i = 3, i < x in
				341	     <b>c = a + b</b> :
				342	     <b>a = b</b> :
				343	     <b>b = c</b>) :
				344	  b;
				345
				346	# Call it.
				347	fibi(10);
				348	</pre>
				349	</div>
				350
				351	<p>
				352	In order to mutate variables, we have to change our existing variables to use
				353	the "alloca trick". Once we have that, we'll add our new operator, then extend
				354	Kaleidoscope to support new variable definitions.
				355	</p>
				356
				357	</div>
				358
				359	<!-- *********************************************************************** -->
				360	<div class="doc_section"><a name="adjustments">Adjusting Existing Variables for
				361	Mutation</a></div>
				362	<!-- *********************************************************************** -->
				363
				364	<div class="doc_text">
				365
				366	<p>
				367	The symbol table in Kaleidoscope is managed at code generation time by the
				368	'<tt>NamedValues</tt>' map. This map currently keeps track of the LLVM "Value*"
				369	that holds the double value for the named variable. In order to support
				370	mutation, we need to change this slightly, so that <tt>NamedValues</tt> holds
				371	the <em>memory location</em> of the variable in question. Note that this
				372	change is a refactoring: it changes the structure of the code, but does not
				373	(by itself) change the behavior of the compiler. All of these changes are
				374	isolated in the Kaleidoscope code generator.</p>
				375
				376	<p>
				377	At this point in Kaleidoscope's development, it only supports variables for two
				378	things: incoming arguments to functions and the induction variable of 'for'
				379	loops. For consistency, we'll allow mutation of these variables in addition to
				380	other user-defined variables. This means that these will both need memory
				381	locations.
				382	</p>
				383
				384	<p>To start our transformation of Kaleidoscope, we'll change the NamedValues
				385	map to map to AllocaInst* instead of Value*. Once we do this, the C++ compiler
				386	will tell us which parts of the code we need to update:</p>
				387
				388	<div class="doc_code">
				389	<pre>
				390	static std::map<std::string, AllocaInst*> NamedValues;
				391	</pre>
				392	</div>
				393
				394	<p>Also, since we will need to create these allocas, we'll use a helper
				395	function that ensures that the allocas are created in the entry block of the
				396	function:</p>
				397
				398	<div class="doc_code">
				399	<pre>
				400	/// CreateEntryBlockAlloca - Create an alloca instruction in the entry block of
				401	/// the function. This is used for mutable variables etc.
				402	static AllocaInst *CreateEntryBlockAlloca(Function *TheFunction,
				403	                                          const std::string &VarName) {
				404	  LLVMBuilder TmpB(&TheFunction->getEntryBlock(),
				405	                   TheFunction->getEntryBlock().begin());
				406	  return TmpB.CreateAlloca(Type::DoubleTy, 0, VarName.c_str());
				407	}
				408	</pre>
				409	</div>
				410
				411	<p>This funny looking code creates an LLVMBuilder object that is pointing at
				412	the first instruction (.begin()) of the entry block. It then creates an alloca
				413	with the expected name and returns it. Because all values in Kaleidoscope are
				414	doubles, there is no need to pass in a type to use.</p>
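<p>To see why the helper uses its own builder positioned at
<tt>.begin()</tt>, consider the following toy model. It assumes a function is
just a list of blocks holding textual instructions. <tt>ToyBlock</tt>,
<tt>ToyFunction</tt>, and <tt>createEntryBlockAlloca</tt> are hypothetical
names for this sketch and are not part of LLVM.</p>

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-ins: a block is a list of textual instructions and a
// function is a list of blocks, with Blocks[0] as the entry block.
typedef std::vector<std::string> ToyBlock;
struct ToyFunction {
  std::vector<ToyBlock> Blocks;
};

// Mirrors the idea of the helper above: always insert at the very start of
// the entry block, no matter where the rest of the code is being emitted.
// Assumes F already has an entry block.
std::string createEntryBlockAlloca(ToyFunction &F, const std::string &VarName) {
  ToyBlock &Entry = F.Blocks.front();
  Entry.insert(Entry.begin(), "%" + VarName + " = alloca double");
  return "%" + VarName;
}
```

<p>Even if code generation is currently appending to a later block, such as a
loop body, the alloca still lands in the entry block, which is where mem2reg
looks for it.</p>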
				415
				416	<p>With this in place, the first functionality change we want to make is to
				417	variable references. In our new scheme, variables live on the stack, so code
				418	generating a reference to them actually needs to produce a load from the stack
				419	slot:</p>
				420
				421	<div class="doc_code">
				422	<pre>
				423	Value *VariableExprAST::Codegen() {
				424	  // Look this variable up in the function.
				425	  Value *V = NamedValues[Name];
				426	  if (V == 0) return ErrorV("Unknown variable name");
				427
				428	  // Load the value.
				429	  return Builder.CreateLoad(V, Name.c_str());
				430	}
				431	</pre>
				432	</div>
				433
				434	<p>As you can see, this is pretty straightforward. Next we need to update the
				435	things that define the variables to set up the alloca. We'll start with
				436	<tt>ForExprAST::Codegen</tt> (see the <a href="#code">full code listing</a> for
				437	the unabridged code):</p>
				438
				439	<div class="doc_code">
				440	<pre>
				441	Function *TheFunction = Builder.GetInsertBlock()->getParent();
				442
				443	<b>// Create an alloca for the variable in the entry block.
				444	AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);</b>
				445
				446	// Emit the start code first, without 'variable' in scope.
				447	Value *StartVal = Start->Codegen();
				448	if (StartVal == 0) return 0;
				449
				450	<b>// Store the value into the alloca.
				451	Builder.CreateStore(StartVal, Alloca);</b>
				452	...
				453
				454	// Compute the end condition.
				455	Value *EndCond = End->Codegen();
				456	if (EndCond == 0) return EndCond;
				457
				458	<b>// Reload, increment, and restore the alloca. This handles the case where
				459	// the body of the loop mutates the variable.
				460	Value *CurVar = Builder.CreateLoad(Alloca);
				461	Value *NextVar = Builder.CreateAdd(CurVar, StepVal, "nextvar");
				462	Builder.CreateStore(NextVar, Alloca);</b>
				463	...
				464	</pre>
				465	</div>
				466
				467	<p>This code is virtually identical to the code <a
				468	href="LangImpl5.html#forcodegen">before we allowed mutable variables</a>. The
				469	big difference is that we no longer have to construct a PHI node, and we use
				470	load/store to access the variable as needed.</p>
				471
				472	<p>To support mutable argument variables, we need to also make allocas for them.
				473	The code for this is also pretty simple:</p>
				474
				475	<div class="doc_code">
				476	<pre>
				477	/// CreateArgumentAllocas - Create an alloca for each argument and register the
				478	/// argument in the symbol table so that references to it will succeed.
				479	void PrototypeAST::CreateArgumentAllocas(Function *F) {
				480	  Function::arg_iterator AI = F->arg_begin();
				481	  for (unsigned Idx = 0, e = Args.size(); Idx != e; ++Idx, ++AI) {
				482	    // Create an alloca for this variable.
				483	    AllocaInst *Alloca = CreateEntryBlockAlloca(F, Args[Idx]);
				484
				485	    // Store the initial value into the alloca.
				486	    Builder.CreateStore(AI, Alloca);
				487
				488	    // Add arguments to variable symbol table.
				489	    NamedValues[Args[Idx]] = Alloca;
				490	  }
				491	}
				492	</pre>
				493	</div>
				494
				495	<p>For each argument, we make an alloca, store the input value to the function
				496	into the alloca, and register the alloca as the memory location for the
				497	argument. This method gets invoked by <tt>FunctionAST::Codegen</tt> right after
				498	it sets up the entry block for the function.</p>
				499
				500	<p>The final missing piece is adding the 'mem2reg' pass, which allows us to get
				501	good codegen once again:</p>
				502
				503	<div class="doc_code">
				504	<pre>
				505	// Set up the optimizer pipeline. Start with registering info about how the
				506	// target lays out data structures.
				507	OurFPM.add(new TargetData(*TheExecutionEngine->getTargetData()));
				508	<b>// Promote allocas to registers.
				509	OurFPM.add(createPromoteMemoryToRegisterPass());</b>
				510	// Do simple "peephole" optimizations and bit-twiddling optzns.
				511	OurFPM.add(createInstructionCombiningPass());
				512	// Reassociate expressions.
				513	OurFPM.add(createReassociatePass());
				514	</pre>
				515	</div>
				516
				517	<p>It is interesting to see what the code looks like before and after the
				518	mem2reg optimization runs. For example, this is the before/after code for our
				519	recursive fib. Before the optimization:</p>
				520
				521	<div class="doc_code">
				522	<pre>
				523	define double @fib(double %x) {
				524	entry:
				525	<b>%x1 = alloca double
				526	store double %x, double* %x1
				527	%x2 = load double* %x1</b>
				528	%multmp = fcmp ult double %x2, 3.000000e+00
				529	%booltmp = uitofp i1 %multmp to double
				530	%ifcond = fcmp one double %booltmp, 0.000000e+00
				531	br i1 %ifcond, label %then, label %else
				532
				533	then: ; preds = %entry
				534	br label %ifcont
				535
				536	else: ; preds = %entry
				537	<b>%x3 = load double* %x1</b>
				538	%subtmp = sub double %x3, 1.000000e+00
				539	%calltmp = call double @fib( double %subtmp )
				540	<b>%x4 = load double* %x1</b>
				541	%subtmp5 = sub double %x4, 2.000000e+00
				542	%calltmp6 = call double @fib( double %subtmp5 )
				543	%addtmp = add double %calltmp, %calltmp6
				544	br label %ifcont
				545
				546	ifcont: ; preds = %else, %then
				547	%iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ]
				548	ret double %iftmp
				549	}
				550	</pre>
				551	</div>
				552
				553	<p>Here there is only one variable (x, the input argument) but you can still
				554	see the extremely simple-minded code generation strategy we are using. In the
				555	entry block, an alloca is created, and the initial input value is stored into
				556	it. Each reference to the variable does a reload from the stack. Also, note
				557	that we didn't modify the if/then/else expression, so it still inserts a PHI
				558	node. While we could make an alloca for it, it is actually easier to create a
				559	PHI node for it, so we still just make the PHI.</p>
				560
				561	<p>Here is the code after the mem2reg pass runs:</p>
				562
				563	<div class="doc_code">
				564	<pre>
				565	define double @fib(double %x) {
				566	entry:
				567	%multmp = fcmp ult double <b>%x</b>, 3.000000e+00
				568	%booltmp = uitofp i1 %multmp to double
				569	%ifcond = fcmp one double %booltmp, 0.000000e+00
				570	br i1 %ifcond, label %then, label %else
				571
				572	then:
				573	br label %ifcont
				574
				575	else:
				576	%subtmp = sub double <b>%x</b>, 1.000000e+00
				577	%calltmp = call double @fib( double %subtmp )
				578	%subtmp5 = sub double <b>%x</b>, 2.000000e+00
				579	%calltmp6 = call double @fib( double %subtmp5 )
				580	%addtmp = add double %calltmp, %calltmp6
				581	br label %ifcont
				582
				583	ifcont: ; preds = %else, %then
				584	%iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ]
				585	ret double %iftmp
				586	}
				587	</pre>
				588	</div>
				589
				590	<p>This is a trivial case for mem2reg, since there are no redefinitions of the
				591	variable. The point of showing this is to calm your tension about inserting
				592	such blatant inefficiencies :).</p>
				593
				594	<p>After the rest of the optimizers run, we get:</p>
				595
				596	<div class="doc_code">
				597	<pre>
				598	define double @fib(double %x) {
				599	entry:
				600	%multmp = fcmp ult double %x, 3.000000e+00
				601	%booltmp = uitofp i1 %multmp to double
				602	%ifcond = fcmp ueq double %booltmp, 0.000000e+00
				603	br i1 %ifcond, label %else, label %ifcont
				604
				605	else:
				606	%subtmp = sub double %x, 1.000000e+00
				607	%calltmp = call double @fib( double %subtmp )
				608	%subtmp5 = sub double %x, 2.000000e+00
				609	%calltmp6 = call double @fib( double %subtmp5 )
				610	%addtmp = add double %calltmp, %calltmp6
				611	ret double %addtmp
				612
				613	ifcont:
				614	ret double 1.000000e+00
				615	}
				616	</pre>
				617	</div>
				618
				619	<p>Here we see that the simplifycfg pass decided to clone the return instruction
				620	into the end of the 'else' block. This allowed it to eliminate some branches
				621	and the PHI node.</p>
				622
				623	<p>Now that all symbol table references are updated to use stack variables,
				624	we'll add the assignment operator.</p>
				625
				626	</div>
				627
				628	<!-- *********************************************************************** -->
				629	<div class="doc_section"><a name="assignment">New Assignment Operator</a></div>
				630	<!-- *********************************************************************** -->
				631
				632	<div class="doc_text">
				633
				634	<p>With our current framework, adding a new assignment operator is really
				635	simple. We will parse it just like any other binary operator, but handle it
				636	internally (instead of allowing the user to define it). The first step is to
				637	set a precedence:</p>
				638
				639	<div class="doc_code">
				640	<pre>
				641	int main() {
				642	// Install standard binary operators.
				643	// 1 is lowest precedence.
				644	<b>BinopPrecedence['='] = 2;</b>
				645	BinopPrecedence['<'] = 10;
				646	BinopPrecedence['+'] = 20;
				647	BinopPrecedence['-'] = 20;
				648	</pre>
				649	</div>
				650
				651	<p>Now that the parser knows the precedence of the binary operator, it takes
				652	care of all the parsing and AST generation. We just need to implement codegen
				653	for the assignment operator. This looks like:</p>
				654
				655	<div class="doc_code">
				656	<pre>
				657	Value *BinaryExprAST::Codegen() {
				658	  // Special case '=' because we don't want to emit the LHS as an expression.
				659	  if (Op == '=') {
				660	    // Assignment requires the LHS to be an identifier.
				661	    VariableExprAST *LHSE = dynamic_cast<VariableExprAST*>(LHS);
				662	    if (!LHSE)
				663	      return ErrorV("destination of '=' must be a variable");
				664	</pre>
				665	</div>
				666
				667	<p>Unlike the rest of the binary operators, our assignment operator doesn't
				668	follow the "emit LHS, emit RHS, do computation" model. As such, it is handled
				669	as a special case before the other binary operators are handled. The other
				670	strange thing about it is that it requires the LHS to be a variable directly.
				671	</p>
				672
				673	<div class="doc_code">
				674	<pre>
				675	    // Codegen the RHS.
				676	    Value *Val = RHS->Codegen();
				677	    if (Val == 0) return 0;
				678
				679	    // Look up the name.
				680	    Value *Variable = NamedValues[LHSE->getName()];
				681	    if (Variable == 0) return ErrorV("Unknown variable name");
				682
				683	    Builder.CreateStore(Val, Variable);
				684	    return Val;
				685	  }
				686	  ...
				687	</pre>
				688	</div>
				689
				690	<p>Once it has the variable, codegen'ing the assignment is straightforward:
				691	we emit the RHS of the assignment, create a store, and return the computed
				692	value. Returning a value allows for chained assignments like "X = (Y = Z)".</p>
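<p>The effect of returning the stored value can be sketched outside of LLVM
with a toy model. Here <tt>Slots</tt> and <tt>assign</tt> are hypothetical
names standing in for the stack slots and for the '=' case of the code
generator. Unlike the real code, this sketch creates a slot on first
assignment instead of reporting an unknown variable.</p>

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for stack slots: variable name -> current value.
static std::map<std::string, double> Slots;

// Models the '=' case: store the RHS value into the slot, then return the
// value so that an enclosing expression can use it.
double assign(const std::string &Name, double Val) {
  Slots[Name] = Val;
  return Val;
}
```

<p>With this shape, "X = (Y = Z)" corresponds to
<tt>assign("X", assign("Y", Slots["Z"]))</tt>: the inner call stores into Y and
hands the same value to the outer call, which stores it into X.</p>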
				693
				694	<p>Now that we have an assignment operator, we can mutate loop variables and
				695	arguments. For example, we can now run code like this:</p>
				696
				697	<div class="doc_code">
				698	<pre>
				699	# Function to print a double.
				700	extern printd(x);
				701
				702	# Define ':' for sequencing: as a low-precedence operator that ignores operands
				703	# and just returns the RHS.
				704	def binary : 1 (x y) y;
				705
				706	def test(x)
				707	printd(x) :
				708	x = 4 :
				709	printd(x);
				710
				711	test(123);
				712	</pre>
				713	</div>
				714
				715	<p>When run, this example prints "123" and then "4", showing that we did
				716	actually mutate the value! Okay, we have now officially implemented our goal:
				717	getting this to work requires SSA construction in the general case. However,
				718	to be really useful, we want the ability to define our own local variables. Let's
				719	add this next!
				720	</p>
				721
				722	</div>
				723
				724	<!-- *********************************************************************** -->
				725	<div class="doc_section"><a name="localvars">User-defined Local
				726	Variables</a></div>
				727	<!-- *********************************************************************** -->
				728
				729	<div class="doc_text">
				730
				731	<p>Adding var/in is just like any other extension we made to
				732	Kaleidoscope: we extend the lexer, the parser, the AST and the code generator.
				733	The first step for adding our new 'var/in' construct is to extend the lexer.
				734	As before, this is pretty trivial; the code looks like this:</p>
				735
				736	<div class="doc_code">
				737	<pre>
				738	enum Token {
				739	...
				740	<b>// var definition
				741	tok_var = -13</b>
				742	...
				743	}
				744	...
				745	static int gettok() {
				746	...
				747	if (IdentifierStr == "in") return tok_in;
				748	if (IdentifierStr == "binary") return tok_binary;
				749	if (IdentifierStr == "unary") return tok_unary;
				750	<b>if (IdentifierStr == "var") return tok_var;</b>
				751	return tok_identifier;
				752	...
				753	</pre>
				754	</div>
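<p>The keyword check compares the whole identifier, so names that merely start
with "var" remain ordinary identifiers. The sketch below isolates that lookup
in a hypothetical helper, <tt>classifyIdentifier</tt>, which is not part of the
tutorial code (there the comparisons sit inline in <tt>gettok</tt>); only the
token codes relevant here are reproduced.</p>

```cpp
#include <cassert>
#include <string>

// Subset of the tutorial's token codes that matter for keyword lookup.
enum Token {
  tok_identifier = -4,
  tok_in = -10,
  tok_binary = -11,
  tok_unary = -12,
  tok_var = -13
};

// Hypothetical helper: map an already-scanned identifier to its token code.
int classifyIdentifier(const std::string &IdentifierStr) {
  assert(!IdentifierStr.empty() && "identifiers have at least one character");
  if (IdentifierStr == "in") return tok_in;
  if (IdentifierStr == "binary") return tok_binary;
  if (IdentifierStr == "unary") return tok_unary;
  if (IdentifierStr == "var") return tok_var;
  return tok_identifier; // Anything else is an ordinary name.
}
```

<p>With this helper, <tt>"var"</tt> maps to <tt>tok_var</tt> while a name such
as <tt>"variable"</tt> still maps to <tt>tok_identifier</tt>.</p>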
				755
				756	<p>The next step is to define the AST node that we will construct. For var/in,
				757	it will look like this:</p>
				758
				759	<div class="doc_code">
				760	<pre>
				761	/// VarExprAST - Expression class for var/in
				762	class VarExprAST : public ExprAST {
				763	std::vector<std::pair<std::string, ExprAST*> > VarNames;
				764	ExprAST *Body;
				765	public:
				766	VarExprAST(const std::vector<std::pair<std::string, ExprAST*> > &varnames,
				767	ExprAST *body)
				768	: VarNames(varnames), Body(body) {}
				769
				770	virtual Value *Codegen();
				771	};
				772	</pre>
				773	</div>
				774
				775	<p>var/in allows a list of names to be defined all at once, and each name can
				776	optionally have an initializer value. As such, we capture this information in
				777	the VarNames vector. var/in also has a body, which is allowed to access
				778	the variables defined by the var/in.</p>
				779
				780	<p>With this ready, we can define the parser pieces. The first thing we do is add
				781	it as a primary expression:</p>
				782
				783	<div class="doc_code">
				784	<pre>
				785	/// primary
				786	/// ::= identifierexpr
				787	/// ::= numberexpr
				788	/// ::= parenexpr
				789	/// ::= ifexpr
				790	/// ::= forexpr
				791	<b>/// ::= varexpr</b>
				792	static ExprAST *ParsePrimary() {
				793	switch (CurTok) {
				794	default: return Error("unknown token when expecting an expression");
				795	case tok_identifier: return ParseIdentifierExpr();
				796	case tok_number: return ParseNumberExpr();
				797	case '(': return ParseParenExpr();
				798	case tok_if: return ParseIfExpr();
				799	case tok_for: return ParseForExpr();
				800	<b>case tok_var: return ParseVarExpr();</b>
				801	}
				802	}
				803	</pre>
				804	</div>
				805
				806	<p>Next we define ParseVarExpr:</p>
				807
				808	<div class="doc_code">
				809	<pre>
				810	/// varexpr ::= 'var' identifier ('=' expression)?
				811	/// (',' identifier ('=' expression)?)* 'in' expression
				812	static ExprAST *ParseVarExpr() {
				813	getNextToken(); // eat the var.
				814
				815	std::vector<std::pair<std::string, ExprAST*> > VarNames;
				816
				817	// At least one variable name is required.
				818	if (CurTok != tok_identifier)
				819	return Error("expected identifier after var");
				820	</pre>
				821	</div>
				822
				823	<p>The first part of this code parses the list of identifier/expr pairs into the
				824	local <tt>VarNames</tt> vector.</p>
				825
				826	<div class="doc_code">
				827	<pre>
				828	while (1) {
				829	std::string Name = IdentifierStr;
				830	getNextToken(); // eat identifier.
				831
				832	// Read the optional initializer.
				833	ExprAST *Init = 0;
				834	if (CurTok == '=') {
				835	getNextToken(); // eat the '='.
				836
				837	Init = ParseExpression();
				838	if (Init == 0) return 0;
				839	}
				840
				841	VarNames.push_back(std::make_pair(Name, Init));
				842
				843	// End of var list, exit loop.
				844	if (CurTok != ',') break;
				845	getNextToken(); // eat the ','.
				846
				847	if (CurTok != tok_identifier)
				848	return Error("expected identifier list after var");
				849	}
				850	</pre>
				851	</div>
				852
				853	<p>Once all the variables are parsed, we then parse the body and create the
				854	AST node:</p>
				855
				856	<div class="doc_code">
				857	<pre>
				858	// At this point, we have to have 'in'.
				859	if (CurTok != tok_in)
				860	return Error("expected 'in' keyword after 'var'");
				861	getNextToken(); // eat 'in'.
				862
				863	ExprAST *Body = ParseExpression();
				864	if (Body == 0) return 0;
				865
				866	return new VarExprAST(VarNames, Body);
				867	}
				868	</pre>
				869	</div>
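<p>To try the shape of this parsing loop without the rest of the front-end, here
is a rough standalone sketch. It is not the tutorial's parser: it assumes the
input is already split into string tokens, treats each initializer as a single
token rather than a full expression, and uses the hypothetical names
<tt>VarList</tt>, <tt>isIdentifier</tt> and <tt>parseVarList</tt>.</p>

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical simplified result: each name paired with the text of its
// initializer, or an empty string when none was written.
typedef std::vector<std::pair<std::string, std::string> > VarList;

// A token counts as an identifier if it starts with a letter and is not one
// of the two keywords this sketch knows about.
static bool isIdentifier(const std::string &Tok) {
  return !Tok.empty() && std::isalpha(static_cast<unsigned char>(Tok[0])) &&
         Tok != "var" && Tok != "in";
}

// Parse "var name (= init)? (, name (= init)?)* in" from pre-split tokens.
// Returns false on a syntax error; on success Pos is the index just past 'in'.
bool parseVarList(const std::vector<std::string> &Toks, std::size_t &Pos,
                  VarList &Out) {
  if (Pos >= Toks.size() || Toks[Pos] != "var") return false;
  ++Pos; // eat the var.

  while (true) {
    // Every list entry must start with a variable name.
    if (Pos >= Toks.size() || !isIdentifier(Toks[Pos])) return false;
    std::string Name = Toks[Pos++];

    // Read the optional initializer (a single token in this sketch).
    std::string Init;
    if (Pos < Toks.size() && Toks[Pos] == "=") {
      ++Pos; // eat the '='.
      if (Pos >= Toks.size()) return false;
      Init = Toks[Pos++];
    }
    Out.push_back(std::make_pair(Name, Init));

    // End of var list, exit loop.
    if (Pos >= Toks.size() || Toks[Pos] != ",") break;
    ++Pos; // eat the ','.
  }

  // At this point, we have to have 'in'.
  if (Pos >= Toks.size() || Toks[Pos] != "in") return false;
  ++Pos; // eat 'in'.
  assert(!Out.empty() && "at least one variable was parsed");
  return true;
}
```

<p>For the token sequence <tt>var a = 1 , b in a</tt>, the sketch records
<tt>a</tt> with initializer <tt>1</tt> and <tt>b</tt> with no initializer, and
leaves <tt>Pos</tt> at the first token of the body.</p>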
				870
				871	<p>Now that we can parse and represent the code, we need to support emission of
				872	LLVM IR for it. This code starts out with:</p>
				873
				874	<div class="doc_code">
				875	<pre>
				876	Value *VarExprAST::Codegen() {
				877	std::vector<AllocaInst *> OldBindings;
				878
				879	Function *TheFunction = Builder.GetInsertBlock()->getParent();
				880
				881	// Register all variables and emit their initializer.
				882	for (unsigned i = 0, e = VarNames.size(); i != e; ++i) {
				883	const std::string &VarName = VarNames[i].first;
				884	ExprAST *Init = VarNames[i].second;
				885	</pre>
				886	</div>
				887
				888	<p>Basically it loops over all the variables, installing them one at a time.
				889	For each variable we put into the symbol table, we remember the previous value
				890	that we replace in OldBindings.</p>
				891
				892	<div class="doc_code">
				893	<pre>
				894	// Emit the initializer before adding the variable to scope, this prevents
				895	// the initializer from referencing the variable itself, and permits stuff
				896	// like this:
				897	// var a = 1 in
				898	// var a = a in ... # refers to outer 'a'.
				899	Value *InitVal;
				900	if (Init) {
				901	InitVal = Init->Codegen();
				902	if (InitVal == 0) return 0;
				903	} else { // If not specified, use 0.0.
				904	InitVal = ConstantFP::get(Type::DoubleTy, APFloat(0.0));
				905	}
				906
				907	AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);
				908	Builder.CreateStore(InitVal, Alloca);
				909
				910	// Remember the old variable binding so that we can restore the binding when
				911	// we unrecurse.
				912	OldBindings.push_back(NamedValues[VarName]);
				913
				914	// Remember this binding.
				915	NamedValues[VarName] = Alloca;
				916	}
				917	</pre>
				918	</div>
				919
				920	<p>There are more comments here than code. The basic idea is that we emit the
				921	initializer, create the alloca, then update the symbol table to point to it.
				922	Once all the variables are installed in the symbol table, we evaluate the body
				923	of the var/in expression:</p>
				924
				925	<div class="doc_code">
				926	<pre>
				927	// Codegen the body, now that all vars are in scope.
				928	Value *BodyVal = Body->Codegen();
				929	if (BodyVal == 0) return 0;
				930	</pre>
				931	</div>
				932
				933	<p>Finally, before returning, we restore the previous variable bindings:</p>
				934
				935	<div class="doc_code">
				936	<pre>
				937	// Pop all our variables from scope.
				938	for (unsigned i = 0, e = VarNames.size(); i != e; ++i)
				939	NamedValues[VarNames[i].first] = OldBindings[i];
				940
				941	// Return the body computation.
				942	return BodyVal;
				943	}
				944	</pre>
				945	</div>
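<p>The save-and-restore pattern around <tt>OldBindings</tt> can be exercised
without LLVM. The sketch below is only an analogy: it assumes the symbol table
maps names to <tt>double*</tt> storage slots in place of allocas, takes
initializers as already-computed values, and models the body as a read of one
variable. <tt>SymbolTable</tt> and <tt>evalVarIn</tt> are hypothetical names.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for NamedValues: a name maps to the storage slot of
// the variable currently in scope (null if none), much like an alloca.
typedef std::map<std::string, double *> SymbolTable;

// Model "var <Vars> in <BodyVar>": install each variable, read BodyVar as the
// body, then restore the bindings that were shadowed.
double evalVarIn(SymbolTable &NamedValues,
                 const std::vector<std::pair<std::string, double> > &Vars,
                 const std::string &BodyVar) {
  std::vector<double *> OldBindings;
  // Sized up front so the slot addresses stay valid for the whole call.
  std::vector<double> Slots(Vars.size());

  for (std::size_t i = 0, e = Vars.size(); i != e; ++i) {
    Slots[i] = Vars[i].second; // "store" the initial value
    // Remember the old binding (possibly null) before replacing it.
    OldBindings.push_back(NamedValues[Vars[i].first]);
    NamedValues[Vars[i].first] = &Slots[i];
  }

  // Evaluate the body now that all vars are in scope.
  double *Slot = NamedValues[BodyVar];
  assert(Slot && "Unknown variable name");
  double BodyVal = *Slot;

  // Pop all our variables from scope.
  for (std::size_t i = 0, e = Vars.size(); i != e; ++i)
    NamedValues[Vars[i].first] = OldBindings[i];

  return BodyVal;
}
```

<p>If an outer <tt>a</tt> is already bound, evaluating a var/in that defines a
new <tt>a</tt> sees the inner value inside the body, and the table points at the
outer slot again once the call returns.</p>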
				946
				947	<p>The end result of all of this is that we get properly scoped variable
				948	definitions, and we even (trivially) allow mutation of them :).</p>
				949
				950	<p>With this, we have completed what we set out to do. Our nice iterative fib
				951	example from the intro compiles and runs just fine. The mem2reg pass optimizes
				952	all of our stack variables into SSA registers, inserting PHI nodes where needed,
				953	and our front-end remains simple: no iterated dominator frontier computation
				954	anywhere in sight.</p>
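<p>As a rough, hand-written analogy for what the conversion to SSA achieves (this
is not mem2reg output, and it is plain C++, not Kaleidoscope), the sketch below
writes an iterative Fibonacci twice: once with mutable locals, and once in a
single-assignment style where the loop-carried values are passed as parameters,
which play a role similar to PHI nodes. The function names are hypothetical.</p>

```cpp
#include <cassert>

// Iterative Fibonacci with mutable locals, the style that var/in and '='
// make possible.
double fibMutable(int N) {
  double A = 1, B = 1;
  for (int I = 3; I <= N; ++I) {
    double C = A + B;
    A = B;
    B = C;
  }
  return B;
}

// One loop iteration in single-assignment style: no variable is ever
// reassigned; the next values of A, B and I are passed to the next step.
static double fibLoop(double A, double B, int I, int N) {
  if (I > N) return B;
  return fibLoop(B, A + B, I + 1, N);
}

// Same computation as fibMutable, expressed without mutation.
double fibSingleAssignment(int N) {
  assert(N >= 1 && "this sketch is only defined for N >= 1");
  return fibLoop(1, 1, 3, N);
}
```

<p>Both functions compute the same values; the front-end gets to emit the first,
simpler form and leave the rewrite into the second kind of form to the
optimizer.</p>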
				955
				956	</div>
Chris Lattner	00c992d	2007-11-03 08:55:29 +0000	[diff] [blame]	957
				958	<!-- *********************************************************************** -->
				959	<div class="doc_section"><a name="code">Full Code Listing</a></div>
				960	<!-- *********************************************************************** -->
				961
				962	<div class="doc_text">
				963
				964	<p>
Chris Lattner	62a709d	2007-11-05 00:23:57 +0000	[diff] [blame^]	965	Here is the complete code listing for our running example, enhanced with mutable
				966	variables and var/in support. To build this example, use:
Chris Lattner	00c992d	2007-11-03 08:55:29 +0000	[diff] [blame]	967	</p>
				968
				969	<div class="doc_code">
				970	<pre>
				971	# Compile
				972	g++ -g toy.cpp `llvm-config --cppflags --ldflags --libs core jit native` -O3 -o toy
				973	# Run
				974	./toy
				975	</pre>
				976	</div>
				977
				978	<p>Here is the code:</p>
				979
				980	<div class="doc_code">
				981	<pre>
Chris Lattner	62a709d	2007-11-05 00:23:57 +0000	[diff] [blame^]	982	#include "llvm/DerivedTypes.h"
				983	#include "llvm/ExecutionEngine/ExecutionEngine.h"
				984	#include "llvm/Module.h"
				985	#include "llvm/ModuleProvider.h"
				986	#include "llvm/PassManager.h"
				987	#include "llvm/Analysis/Verifier.h"
				988	#include "llvm/Target/TargetData.h"
				989	#include "llvm/Transforms/Scalar.h"
				990	#include "llvm/Support/LLVMBuilder.h"
				991	#include <cstdio>
				992	#include <string>
				993	#include <map>
				994	#include <vector>
				995	using namespace llvm;
				996
				997	//===----------------------------------------------------------------------===//
				998	// Lexer
				999	//===----------------------------------------------------------------------===//
				1000
				1001	// The lexer returns tokens [0-255] if it is an unknown character, otherwise one
				1002	// of these for known things.
				1003	enum Token {
				1004	tok_eof = -1,
				1005
				1006	// commands
				1007	tok_def = -2, tok_extern = -3,
				1008
				1009	// primary
				1010	tok_identifier = -4, tok_number = -5,
				1011
				1012	// control
				1013	tok_if = -6, tok_then = -7, tok_else = -8,
				1014	tok_for = -9, tok_in = -10,
				1015
				1016	// operators
				1017	tok_binary = -11, tok_unary = -12,
				1018
				1019	// var definition
				1020	tok_var = -13
				1021	};
				1022
				1023	static std::string IdentifierStr; // Filled in if tok_identifier
				1024	static double NumVal; // Filled in if tok_number
				1025
				1026	/// gettok - Return the next token from standard input.
				1027	static int gettok() {
				1028	static int LastChar = ' ';
				1029
				1030	// Skip any whitespace.
				1031	while (isspace(LastChar))
				1032	LastChar = getchar();
				1033
				1034	if (isalpha(LastChar)) { // identifier: [a-zA-Z][a-zA-Z0-9]*
				1035	IdentifierStr = LastChar;
				1036	while (isalnum((LastChar = getchar())))
				1037	IdentifierStr += LastChar;
				1038
				1039	if (IdentifierStr == "def") return tok_def;
				1040	if (IdentifierStr == "extern") return tok_extern;
				1041	if (IdentifierStr == "if") return tok_if;
				1042	if (IdentifierStr == "then") return tok_then;
				1043	if (IdentifierStr == "else") return tok_else;
				1044	if (IdentifierStr == "for") return tok_for;
				1045	if (IdentifierStr == "in") return tok_in;
				1046	if (IdentifierStr == "binary") return tok_binary;
				1047	if (IdentifierStr == "unary") return tok_unary;
				1048	if (IdentifierStr == "var") return tok_var;
				1049	return tok_identifier;
				1050	}
				1051
				1052	if (isdigit(LastChar) || LastChar == '.') { // Number: [0-9.]+
				1053	std::string NumStr;
				1054	do {
				1055	NumStr += LastChar;
				1056	LastChar = getchar();
				1057	} while (isdigit(LastChar) || LastChar == '.');
				1058
				1059	NumVal = strtod(NumStr.c_str(), 0);
				1060	return tok_number;
				1061	}
				1062
				1063	if (LastChar == '#') {
				1064	// Comment until end of line.
				1065	do LastChar = getchar();
				1066	while (LastChar != EOF && LastChar != '\n' && LastChar != '\r');
				1067
				1068	if (LastChar != EOF)
				1069	return gettok();
				1070	}
				1071
				1072	// Check for end of file. Don't eat the EOF.
				1073	if (LastChar == EOF)
				1074	return tok_eof;
				1075
				1076	// Otherwise, just return the character as its ascii value.
				1077	int ThisChar = LastChar;
				1078	LastChar = getchar();
				1079	return ThisChar;
				1080	}
				1081
				1082	//===----------------------------------------------------------------------===//
				1083	// Abstract Syntax Tree (aka Parse Tree)
				1084	//===----------------------------------------------------------------------===//
				1085
				1086	/// ExprAST - Base class for all expression nodes.
				1087	class ExprAST {
				1088	public:
				1089	virtual ~ExprAST() {}
				1090	virtual Value *Codegen() = 0;
				1091	};
				1092
				1093	/// NumberExprAST - Expression class for numeric literals like "1.0".
				1094	class NumberExprAST : public ExprAST {
				1095	double Val;
				1096	public:
				1097	NumberExprAST(double val) : Val(val) {}
				1098	virtual Value *Codegen();
				1099	};
				1100
				1101	/// VariableExprAST - Expression class for referencing a variable, like "a".
				1102	class VariableExprAST : public ExprAST {
				1103	std::string Name;
				1104	public:
				1105	VariableExprAST(const std::string &name) : Name(name) {}
				1106	const std::string &getName() const { return Name; }
				1107	virtual Value *Codegen();
				1108	};
				1109
				1110	/// UnaryExprAST - Expression class for a unary operator.
				1111	class UnaryExprAST : public ExprAST {
				1112	char Opcode;
				1113	ExprAST *Operand;
				1114	public:
				1115	UnaryExprAST(char opcode, ExprAST *operand)
				1116	: Opcode(opcode), Operand(operand) {}
				1117	virtual Value *Codegen();
				1118	};
				1119
				1120	/// BinaryExprAST - Expression class for a binary operator.
				1121	class BinaryExprAST : public ExprAST {
				1122	char Op;
				1123	ExprAST *LHS, *RHS;
				1124	public:
				1125	BinaryExprAST(char op, ExprAST *lhs, ExprAST *rhs)
				1126	: Op(op), LHS(lhs), RHS(rhs) {}
				1127	virtual Value *Codegen();
				1128	};
				1129
				1130	/// CallExprAST - Expression class for function calls.
				1131	class CallExprAST : public ExprAST {
				1132	std::string Callee;
				1133	std::vector<ExprAST*> Args;
				1134	public:
				1135	CallExprAST(const std::string &callee, std::vector<ExprAST*> &args)
				1136	: Callee(callee), Args(args) {}
				1137	virtual Value *Codegen();
				1138	};
				1139
				1140	/// IfExprAST - Expression class for if/then/else.
				1141	class IfExprAST : public ExprAST {
				1142	ExprAST *Cond, *Then, *Else;
				1143	public:
				1144	IfExprAST(ExprAST *cond, ExprAST *then, ExprAST *_else)
				1145	: Cond(cond), Then(then), Else(_else) {}
				1146	virtual Value *Codegen();
				1147	};
				1148
				1149	/// ForExprAST - Expression class for for/in.
				1150	class ForExprAST : public ExprAST {
				1151	std::string VarName;
				1152	ExprAST *Start, *End, *Step, *Body;
				1153	public:
				1154	ForExprAST(const std::string &varname, ExprAST *start, ExprAST *end,
				1155	ExprAST *step, ExprAST *body)
				1156	: VarName(varname), Start(start), End(end), Step(step), Body(body) {}
				1157	virtual Value *Codegen();
				1158	};
				1159
				1160	/// VarExprAST - Expression class for var/in
				1161	class VarExprAST : public ExprAST {
				1162	std::vector<std::pair<std::string, ExprAST*> > VarNames;
				1163	ExprAST *Body;
				1164	public:
				1165	VarExprAST(const std::vector<std::pair<std::string, ExprAST*> > &varnames,
				1166	ExprAST *body)
				1167	: VarNames(varnames), Body(body) {}
				1168
				1169	virtual Value *Codegen();
				1170	};
				1171
				1172	/// PrototypeAST - This class represents the "prototype" for a function,
				1173	/// which captures its argument names as well as if it is an operator.
				1174	class PrototypeAST {
				1175	std::string Name;
				1176	std::vector<std::string> Args;
				1177	bool isOperator;
				1178	unsigned Precedence; // Precedence if a binary op.
				1179	public:
				1180	PrototypeAST(const std::string &name, const std::vector<std::string> &args,
				1181	bool isoperator = false, unsigned prec = 0)
				1182	: Name(name), Args(args), isOperator(isoperator), Precedence(prec) {}
				1183
				1184	bool isUnaryOp() const { return isOperator && Args.size() == 1; }
				1185	bool isBinaryOp() const { return isOperator && Args.size() == 2; }
				1186
				1187	char getOperatorName() const {
				1188	assert(isUnaryOp() || isBinaryOp());
				1189	return Name[Name.size()-1];
				1190	}
				1191
				1192	unsigned getBinaryPrecedence() const { return Precedence; }
				1193
				1194	Function *Codegen();
				1195
				1196	void CreateArgumentAllocas(Function *F);
				1197	};
				1198
				1199	/// FunctionAST - This class represents a function definition itself.
				1200	class FunctionAST {
				1201	PrototypeAST *Proto;
				1202	ExprAST *Body;
				1203	public:
				1204	FunctionAST(PrototypeAST *proto, ExprAST *body)
				1205	: Proto(proto), Body(body) {}
				1206
				1207	Function *Codegen();
				1208	};
				1209
				1210	//===----------------------------------------------------------------------===//
				1211	// Parser
				1212	//===----------------------------------------------------------------------===//
				1213
				1214	/// CurTok/getNextToken - Provide a simple token buffer. CurTok is the current
				1215	/// token the parser is looking at. getNextToken reads another token from the
				1216	/// lexer and updates CurTok with its results.
				1217	static int CurTok;
				1218	static int getNextToken() {
				1219	return CurTok = gettok();
				1220	}
				1221
				1222	/// BinopPrecedence - This holds the precedence for each binary operator that is
				1223	/// defined.
				1224	static std::map<char, int> BinopPrecedence;
				1225
				1226	/// GetTokPrecedence - Get the precedence of the pending binary operator token.
				1227	static int GetTokPrecedence() {
				1228	if (!isascii(CurTok))
				1229	return -1;
				1230
				1231	// Make sure it's a declared binop.
				1232	int TokPrec = BinopPrecedence[CurTok];
				1233	if (TokPrec <= 0) return -1;
				1234	return TokPrec;
				1235	}
				1236
				1237	/// Error* - These are little helper functions for error handling.
				1238	ExprAST *Error(const char *Str) { fprintf(stderr, "Error: %s\n", Str);return 0;}
				1239	PrototypeAST *ErrorP(const char *Str) { Error(Str); return 0; }
				1240	FunctionAST *ErrorF(const char *Str) { Error(Str); return 0; }
				1241
				1242	static ExprAST *ParseExpression();
				1243
				1244	/// identifierexpr
				1245	/// ::= identifier
				1246	/// ::= identifier '(' expression* ')'
				1247	static ExprAST *ParseIdentifierExpr() {
				1248	std::string IdName = IdentifierStr;
				1249
				1250	getNextToken(); // eat identifier.
				1251
				1252	if (CurTok != '(') // Simple variable ref.
				1253	return new VariableExprAST(IdName);
				1254
				1255	// Call.
				1256	getNextToken(); // eat (
				1257	std::vector<ExprAST*> Args;
				1258	if (CurTok != ')') {
				1259	while (1) {
				1260	ExprAST *Arg = ParseExpression();
				1261	if (!Arg) return 0;
				1262	Args.push_back(Arg);
				1263
				1264	if (CurTok == ')') break;
				1265
				1266	if (CurTok != ',')
				1267	return Error("Expected ')'");
				1268	getNextToken();
				1269	}
				1270	}
				1271
				1272	// Eat the ')'.
				1273	getNextToken();
				1274
				1275	return new CallExprAST(IdName, Args);
				1276	}
				1277
				1278	/// numberexpr ::= number
				1279	static ExprAST *ParseNumberExpr() {
				1280	ExprAST *Result = new NumberExprAST(NumVal);
				1281	getNextToken(); // consume the number
				1282	return Result;
				1283	}
				1284
				1285	/// parenexpr ::= '(' expression ')'
				1286	static ExprAST *ParseParenExpr() {
				1287	getNextToken(); // eat (.
				1288	ExprAST *V = ParseExpression();
				1289	if (!V) return 0;
				1290
				1291	if (CurTok != ')')
				1292	return Error("expected ')'");
				1293	getNextToken(); // eat ).
				1294	return V;
				1295	}
				1296
				1297	/// ifexpr ::= 'if' expression 'then' expression 'else' expression
				1298	static ExprAST *ParseIfExpr() {
				1299	getNextToken(); // eat the if.
				1300
				1301	// condition.
				1302	ExprAST *Cond = ParseExpression();
				1303	if (!Cond) return 0;
				1304
				1305	if (CurTok != tok_then)
				1306	return Error("expected then");
				1307	getNextToken(); // eat the then
				1308
				1309	ExprAST *Then = ParseExpression();
				1310	if (Then == 0) return 0;
				1311
				1312	if (CurTok != tok_else)
				1313	return Error("expected else");
				1314
				1315	getNextToken();
				1316
				1317	ExprAST *Else = ParseExpression();
				1318	if (!Else) return 0;
				1319
				1320	return new IfExprAST(Cond, Then, Else);
				1321	}
				1322
				1323	/// forexpr ::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression
				1324	static ExprAST *ParseForExpr() {
				1325	getNextToken(); // eat the for.
				1326
				1327	if (CurTok != tok_identifier)
				1328	return Error("expected identifier after for");
				1329
				1330	std::string IdName = IdentifierStr;
				1331	getNextToken(); // eat identifier.
				1332
				1333	if (CurTok != '=')
				1334	return Error("expected '=' after for");
				1335	getNextToken(); // eat '='.
				1336
				1337
				1338	ExprAST *Start = ParseExpression();
				1339	if (Start == 0) return 0;
				1340	if (CurTok != ',')
				1341	return Error("expected ',' after for start value");
				1342	getNextToken();
				1343
				1344	ExprAST *End = ParseExpression();
				1345	if (End == 0) return 0;
				1346
				1347	// The step value is optional.
				1348	ExprAST *Step = 0;
				1349	if (CurTok == ',') {
				1350	getNextToken();
				1351	Step = ParseExpression();
				1352	if (Step == 0) return 0;
				1353	}
				1354
				1355	if (CurTok != tok_in)
				1356	return Error("expected 'in' after for");
				1357	getNextToken(); // eat 'in'.
				1358
				1359	ExprAST *Body = ParseExpression();
				1360	if (Body == 0) return 0;
				1361
				1362	return new ForExprAST(IdName, Start, End, Step, Body);
				1363	}
				1364
				1365	/// varexpr ::= 'var' identifier ('=' expression)?
				1366	/// (',' identifier ('=' expression)?)* 'in' expression
				1367	static ExprAST *ParseVarExpr() {
				1368	getNextToken(); // eat the var.
				1369
				1370	std::vector<std::pair<std::string, ExprAST*> > VarNames;
				1371
				1372	// At least one variable name is required.
				1373	if (CurTok != tok_identifier)
				1374	return Error("expected identifier after var");
				1375
				1376	while (1) {
				1377	std::string Name = IdentifierStr;
				1378	getNextToken(); // eat identifier.
				1379
				1380	// Read the optional initializer.
				1381	ExprAST *Init = 0;
				1382	if (CurTok == '=') {
				1383	getNextToken(); // eat the '='.
				1384
				1385	Init = ParseExpression();
				1386	if (Init == 0) return 0;
				1387	}
				1388
				1389	VarNames.push_back(std::make_pair(Name, Init));
				1390
				1391	// End of var list, exit loop.
				1392	if (CurTok != ',') break;
				1393	getNextToken(); // eat the ','.
				1394
				1395	if (CurTok != tok_identifier)
				1396	return Error("expected identifier list after var");
				1397	}
				1398
				1399	// At this point, we have to have 'in'.
				1400	if (CurTok != tok_in)
				1401	return Error("expected 'in' keyword after 'var'");
				1402	getNextToken(); // eat 'in'.
				1403
				1404	ExprAST *Body = ParseExpression();
				1405	if (Body == 0) return 0;
				1406
				1407	return new VarExprAST(VarNames, Body);
				1408	}
				1409
				1410
				1411	/// primary
				1412	/// ::= identifierexpr
				1413	/// ::= numberexpr
				1414	/// ::= parenexpr
				1415	/// ::= ifexpr
				1416	/// ::= forexpr
				1417	/// ::= varexpr
				1418	static ExprAST *ParsePrimary() {
				1419	switch (CurTok) {
				1420	default: return Error("unknown token when expecting an expression");
				1421	case tok_identifier: return ParseIdentifierExpr();
				1422	case tok_number: return ParseNumberExpr();
				1423	case '(': return ParseParenExpr();
				1424	case tok_if: return ParseIfExpr();
				1425	case tok_for: return ParseForExpr();
				1426	case tok_var: return ParseVarExpr();
				1427	}
				1428	}
				1429
				1430	/// unary
				1431	/// ::= primary
				1432	/// ::= '!' unary
				1433	static ExprAST *ParseUnary() {
				1434	// If the current token is not an operator, it must be a primary expr.
				1435	if (!isascii(CurTok) || CurTok == '(' || CurTok == ',')
				1436	return ParsePrimary();
				1437
				1438	// If this is a unary operator, read it.
				1439	int Opc = CurTok;
				1440	getNextToken();
				1441	if (ExprAST *Operand = ParseUnary())
				1442	return new UnaryExprAST(Opc, Operand);
				1443	return 0;
				1444	}
				1445
				1446	/// binoprhs
				1447	/// ::= ('+' unary)*
				1448	static ExprAST *ParseBinOpRHS(int ExprPrec, ExprAST *LHS) {
				1449	// If this is a binop, find its precedence.
				1450	while (1) {
				1451	int TokPrec = GetTokPrecedence();
				1452
				1453	// If this is a binop that binds at least as tightly as the current binop,
				1454	// consume it, otherwise we are done.
				1455	if (TokPrec < ExprPrec)
				1456	return LHS;
				1457
				1458	// Okay, we know this is a binop.
				1459	int BinOp = CurTok;
				1460	getNextToken(); // eat binop
				1461
				1462	// Parse the unary expression after the binary operator.
				1463	ExprAST *RHS = ParseUnary();
				1464	if (!RHS) return 0;
				1465
				1466	// If BinOp binds less tightly with RHS than the operator after RHS, let
				1467	// the pending operator take RHS as its LHS.
				1468	int NextPrec = GetTokPrecedence();
				1469	if (TokPrec < NextPrec) {
				1470	RHS = ParseBinOpRHS(TokPrec+1, RHS);
				1471	if (RHS == 0) return 0;
				1472	}
				1473
				1474	// Merge LHS/RHS.
				1475	LHS = new BinaryExprAST(BinOp, LHS, RHS);
				1476	}
				1477	}
				1478
				1479	/// expression
				1480	/// ::= unary binoprhs
				1481	///
				1482	static ExprAST *ParseExpression() {
				1483	ExprAST *LHS = ParseUnary();
				1484	if (!LHS) return 0;
				1485
				1486	return ParseBinOpRHS(0, LHS);
				1487	}
				1488
				1489	/// prototype
				1490	/// ::= id '(' id* ')'
				1491	/// ::= binary LETTER number? (id, id)
				1492	/// ::= unary LETTER (id)
				1493	static PrototypeAST *ParsePrototype() {
				1494	std::string FnName;
				1495
				1496	int Kind = 0; // 0 = identifier, 1 = unary, 2 = binary.
				1497	unsigned BinaryPrecedence = 30;
				1498
				1499	switch (CurTok) {
				1500	default:
				1501	return ErrorP("Expected function name in prototype");
				1502	case tok_identifier:
				1503	FnName = IdentifierStr;
				1504	Kind = 0;
				1505	getNextToken();
				1506	break;
				1507	case tok_unary:
				1508	getNextToken();
				1509	if (!isascii(CurTok))
				1510	return ErrorP("Expected unary operator");
				1511	FnName = "unary";
				1512	FnName += (char)CurTok;
				1513	Kind = 1;
				1514	getNextToken();
				1515	break;
				1516	case tok_binary:
				1517	getNextToken();
				1518	if (!isascii(CurTok))
				1519	return ErrorP("Expected binary operator");
				1520	FnName = "binary";
				1521	FnName += (char)CurTok;
				1522	Kind = 2;
				1523	getNextToken();
				1524
				1525	// Read the precedence if present.
				1526	if (CurTok == tok_number) {
				1527	if (NumVal < 1 || NumVal > 100)
				1528	return ErrorP("Invalid precedence: must be 1..100");
				1529	BinaryPrecedence = (unsigned)NumVal;
				1530	getNextToken();
				1531	}
				1532	break;
				1533	}
				1534
				1535	if (CurTok != '(')
				1536	return ErrorP("Expected '(' in prototype");
				1537
				1538	std::vector<std::string> ArgNames;
				1539	while (getNextToken() == tok_identifier)
				1540	ArgNames.push_back(IdentifierStr);
				1541	if (CurTok != ')')
				1542	return ErrorP("Expected ')' in prototype");
				1543
				1544	// success.
				1545	getNextToken(); // eat ')'.
				1546
				1547	// Verify right number of names for operator.
				1548	if (Kind && ArgNames.size() != Kind)
				1549	return ErrorP("Invalid number of operands for operator");
				1550
				1551	return new PrototypeAST(FnName, ArgNames, Kind != 0, BinaryPrecedence);
				1552	}
				1553
				1554	/// definition ::= 'def' prototype expression
				1555	static FunctionAST *ParseDefinition() {
				1556	getNextToken(); // eat def.
				1557	PrototypeAST *Proto = ParsePrototype();
				1558	if (Proto == 0) return 0;
				1559
				1560	if (ExprAST *E = ParseExpression())
				1561	return new FunctionAST(Proto, E);
				1562	return 0;
				1563	}
				1564
				1565	/// toplevelexpr ::= expression
				1566	static FunctionAST *ParseTopLevelExpr() {
				1567	if (ExprAST *E = ParseExpression()) {
				1568	// Make an anonymous proto.
				1569	PrototypeAST *Proto = new PrototypeAST("", std::vector<std::string>());
				1570	return new FunctionAST(Proto, E);
				1571	}
				1572	return 0;
				1573	}
				1574
				1575	/// external ::= 'extern' prototype
				1576	static PrototypeAST *ParseExtern() {
				1577	getNextToken(); // eat extern.
				1578	return ParsePrototype();
				1579	}
				1580
				1581	//===----------------------------------------------------------------------===//
				1582	// Code Generation
				1583	//===----------------------------------------------------------------------===//
				1584
				1585	static Module *TheModule;
				1586	static LLVMFoldingBuilder Builder;
				1587	static std::map<std::string, AllocaInst*> NamedValues;
				1588	static FunctionPassManager *TheFPM;
				1589
				1590	Value *ErrorV(const char *Str) { Error(Str); return 0; }
				1591
				1592	/// CreateEntryBlockAlloca - Create an alloca instruction in the entry block of
				1593	/// the function. This is used for mutable variables etc.
				1594	static AllocaInst *CreateEntryBlockAlloca(Function *TheFunction,
				1595	const std::string &VarName) {
				1596	LLVMBuilder TmpB(&TheFunction->getEntryBlock(),
				1597	TheFunction->getEntryBlock().begin());
				1598	return TmpB.CreateAlloca(Type::DoubleTy, 0, VarName.c_str());
				1599	}
				1600
				1601
				1602	Value *NumberExprAST::Codegen() {
				1603	return ConstantFP::get(Type::DoubleTy, APFloat(Val));
				1604	}
				1605
				1606	Value *VariableExprAST::Codegen() {
				1607	// Look this variable up in the function.
				1608	Value *V = NamedValues[Name];
				1609	if (V == 0) return ErrorV("Unknown variable name");
				1610
				1611	// Load the value.
				1612	return Builder.CreateLoad(V, Name.c_str());
				1613	}
				1614
				1615	Value *UnaryExprAST::Codegen() {
				1616	Value *OperandV = Operand->Codegen();
				1617	if (OperandV == 0) return 0;
				1618
				1619	Function *F = TheModule->getFunction(std::string("unary")+Opcode);
				1620	if (F == 0)
				1621	return ErrorV("Unknown unary operator");
				1622
				1623	return Builder.CreateCall(F, OperandV, "unop");
				1624	}
				1625
				1626
				1627	Value *BinaryExprAST::Codegen() {
				1628	// Special case '=' because we don't want to emit the LHS as an expression.
				1629	if (Op == '=') {
				1630	// Assignment requires the LHS to be an identifier.
				1631	VariableExprAST *LHSE = dynamic_cast<VariableExprAST*>(LHS);
				1632	if (!LHSE)
				1633	return ErrorV("destination of '=' must be a variable");
				1634	// Codegen the RHS.
				1635	Value *Val = RHS->Codegen();
				1636	if (Val == 0) return 0;
				1637
				1638	// Look up the name.
				1639	Value *Variable = NamedValues[LHSE->getName()];
				1640	if (Variable == 0) return ErrorV("Unknown variable name");
				1641
				1642	Builder.CreateStore(Val, Variable);
				1643	return Val;
				1644	}
				1645
				1646
				1647	Value *L = LHS->Codegen();
				1648	Value *R = RHS->Codegen();
				1649	if (L == 0 || R == 0) return 0;
				1650
				1651	switch (Op) {
				1652	case '+': return Builder.CreateAdd(L, R, "addtmp");
				1653	case '-': return Builder.CreateSub(L, R, "subtmp");
				1654	case '*': return Builder.CreateMul(L, R, "multmp");
				1655	case '<':
				1656	L = Builder.CreateFCmpULT(L, R, "cmptmp");
				1657	// Convert bool 0/1 to double 0.0 or 1.0
				1658	return Builder.CreateUIToFP(L, Type::DoubleTy, "booltmp");
				1659	default: break;
				1660	}
				1661
				1662	// If it wasn't a builtin binary operator, it must be a user defined one. Emit
				1663	// a call to it.
				1664	Function *F = TheModule->getFunction(std::string("binary")+Op);
				1665	assert(F && "binary operator not found!");
				1666
				1667	Value *Ops[] = { L, R };
				1668	return Builder.CreateCall(F, Ops, Ops+2, "binop");
				1669	}
				1670
				1671	Value *CallExprAST::Codegen() {
				1672	// Look up the name in the global module table.
				1673	Function *CalleeF = TheModule->getFunction(Callee);
				1674	if (CalleeF == 0)
				1675	return ErrorV("Unknown function referenced");
				1676
				1677	// Reject the call if the argument count does not match.
				1678	if (CalleeF->arg_size() != Args.size())
				1679	return ErrorV("Incorrect # arguments passed");
				1680
				1681	std::vector<Value*> ArgsV;
				1682	for (unsigned i = 0, e = Args.size(); i != e; ++i) {
				1683	ArgsV.push_back(Args[i]->Codegen());
				1684	if (ArgsV.back() == 0) return 0;
				1685	}
				1686
				1687	return Builder.CreateCall(CalleeF, ArgsV.begin(), ArgsV.end(), "calltmp");
				1688	}
				1689
				1690	Value *IfExprAST::Codegen() {
				1691	Value *CondV = Cond->Codegen();
				1692	if (CondV == 0) return 0;
				1693
				1694	// Convert condition to a bool by comparing non-equal to 0.0.
				1695	CondV = Builder.CreateFCmpONE(CondV,
				1696	ConstantFP::get(Type::DoubleTy, APFloat(0.0)),
				1697	"ifcond");
				1698
				1699	Function *TheFunction = Builder.GetInsertBlock()->getParent();
				1700
				1701	// Create blocks for the then and else cases. Insert the 'then' block at the
				1702	// end of the function.
				1703	BasicBlock *ThenBB = new BasicBlock("then", TheFunction);
				1704	BasicBlock *ElseBB = new BasicBlock("else");
				1705	BasicBlock *MergeBB = new BasicBlock("ifcont");
				1706
				1707	Builder.CreateCondBr(CondV, ThenBB, ElseBB);
				1708
				1709	// Emit then value.
				1710	Builder.SetInsertPoint(ThenBB);
				1711
				1712	Value *ThenV = Then->Codegen();
				1713	if (ThenV == 0) return 0;
				1714
				1715	Builder.CreateBr(MergeBB);
				1716	// Codegen of 'Then' can change the current block, update ThenBB for the PHI.
				1717	ThenBB = Builder.GetInsertBlock();
				1718
				1719	// Emit else block.
				1720	TheFunction->getBasicBlockList().push_back(ElseBB);
				1721	Builder.SetInsertPoint(ElseBB);
				1722
				1723	Value *ElseV = Else->Codegen();
				1724	if (ElseV == 0) return 0;
				1725
				1726	Builder.CreateBr(MergeBB);
				1727	// Codegen of 'Else' can change the current block, update ElseBB for the PHI.
				1728	ElseBB = Builder.GetInsertBlock();
				1729
				1730	// Emit merge block.
				1731	TheFunction->getBasicBlockList().push_back(MergeBB);
				1732	Builder.SetInsertPoint(MergeBB);
				1733	PHINode *PN = Builder.CreatePHI(Type::DoubleTy, "iftmp");
				1734
				1735	PN->addIncoming(ThenV, ThenBB);
				1736	PN->addIncoming(ElseV, ElseBB);
				1737	return PN;
				1738	}
				1739
				1740	Value *ForExprAST::Codegen() {
				1741	// Output this as:
				1742	// var = alloca double
				1743	// ...
				1744	// start = startexpr
				1745	// store start -> var
				1746	// goto loop
				1747	// loop:
				1748	// ...
				1749	// bodyexpr
				1750	// ...
				1751	// loopend:
				1752	// step = stepexpr
				1753	// endcond = endexpr
				1754	//
				1755	// curvar = load var
				1756	// nextvar = curvar + step
				1757	// store nextvar -> var
				1758	// br endcond, loop, afterloop
				1759	// afterloop:
				1760
				1761	Function *TheFunction = Builder.GetInsertBlock()->getParent();
				1762
				1763	// Create an alloca for the variable in the entry block.
				1764	AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);
				1765
				1766	// Emit the start code first, without 'variable' in scope.
				1767	Value *StartVal = Start->Codegen();
				1768	if (StartVal == 0) return 0;
				1769
				1770	// Store the value into the alloca.
				1771	Builder.CreateStore(StartVal, Alloca);
				1772
				1773	// Make the new basic block for the loop header, inserting after current
				1774	// block.
				1775	BasicBlock *PreheaderBB = Builder.GetInsertBlock();
				1776	BasicBlock *LoopBB = new BasicBlock("loop", TheFunction);
				1777
				1778	// Insert an explicit fall through from the current block to the LoopBB.
				1779	Builder.CreateBr(LoopBB);
				1780
				1781	// Start insertion in LoopBB.
				1782	Builder.SetInsertPoint(LoopBB);
				1783
				1784	// Within the loop, the variable is bound to its alloca. If it
				1785	// shadows an existing variable, we have to restore it, so save it now.
				1786	AllocaInst *OldVal = NamedValues[VarName];
				1787	NamedValues[VarName] = Alloca;
				1788
				1789	// Emit the body of the loop. This, like any other expr, can change the
				1790	// current BB. Note that we ignore the value computed by the body, but don't
				1791	// allow an error.
				1792	if (Body->Codegen() == 0)
				1793	return 0;
				1794
				1795	// Emit the step value.
				1796	Value *StepVal;
				1797	if (Step) {
				1798	StepVal = Step->Codegen();
				1799	if (StepVal == 0) return 0;
				1800	} else {
				1801	// If not specified, use 1.0.
				1802	StepVal = ConstantFP::get(Type::DoubleTy, APFloat(1.0));
				1803	}
				1804
				1805	// Compute the end condition.
				1806	Value *EndCond = End->Codegen();
				1807	if (EndCond == 0) return EndCond;
				1808
				1809	// Reload, increment, and restore the alloca. This handles the case where
				1810	// the body of the loop mutates the variable.
				1811	Value *CurVar = Builder.CreateLoad(Alloca, VarName.c_str());
				1812	Value *NextVar = Builder.CreateAdd(CurVar, StepVal, "nextvar");
				1813	Builder.CreateStore(NextVar, Alloca);
				1814
				1815	// Convert condition to a bool by comparing non-equal to 0.0.
				1816	EndCond = Builder.CreateFCmpONE(EndCond,
				1817	ConstantFP::get(Type::DoubleTy, APFloat(0.0)),
				1818	"loopcond");
				1819
				1820	// Create the "after loop" block and insert it.
				1821	BasicBlock *LoopEndBB = Builder.GetInsertBlock();
				1822	BasicBlock *AfterBB = new BasicBlock("afterloop", TheFunction);
				1823
				1824	// Insert the conditional branch into the end of LoopEndBB.
				1825	Builder.CreateCondBr(EndCond, LoopBB, AfterBB);
				1826
				1827	// Any new code will be inserted in AfterBB.
				1828	Builder.SetInsertPoint(AfterBB);
				1829
				1830	// Restore the unshadowed variable.
				1831	if (OldVal)
				1832	NamedValues[VarName] = OldVal;
				1833	else
				1834	NamedValues.erase(VarName);
				1835
				1836
				1837	// for expr always returns 0.0.
				1838	return Constant::getNullValue(Type::DoubleTy);
				1839	}
				1840
				1841	Value *VarExprAST::Codegen() {
				1842	std::vector<AllocaInst *> OldBindings;
				1843
				1844	Function *TheFunction = Builder.GetInsertBlock()->getParent();
				1845
				1846	// Register all variables and emit their initializer.
				1847	for (unsigned i = 0, e = VarNames.size(); i != e; ++i) {
				1848	const std::string &VarName = VarNames[i].first;
				1849	ExprAST *Init = VarNames[i].second;
				1850
				1851	// Emit the initializer before adding the variable to scope, this prevents
				1852	// the initializer from referencing the variable itself, and permits stuff
				1853	// like this:
				1854	// var a = 1 in
				1855	// var a = a in ... # refers to outer 'a'.
				1856	Value *InitVal;
				1857	if (Init) {
				1858	InitVal = Init->Codegen();
				1859	if (InitVal == 0) return 0;
				1860	} else { // If not specified, use 0.0.
				1861	InitVal = ConstantFP::get(Type::DoubleTy, APFloat(0.0));
				1862	}
				1863
				1864	AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);
				1865	Builder.CreateStore(InitVal, Alloca);
				1866
				1867	// Remember the old variable binding so that we can restore the binding when
				1868	// we unrecurse.
				1869	OldBindings.push_back(NamedValues[VarName]);
				1870
				1871	// Remember this binding.
				1872	NamedValues[VarName] = Alloca;
				1873	}
				1874
				1875	// Codegen the body, now that all vars are in scope.
				1876	Value *BodyVal = Body->Codegen();
				1877	if (BodyVal == 0) return 0;
				1878
				1879	// Pop all our variables from scope.
				1880	for (unsigned i = 0, e = VarNames.size(); i != e; ++i)
				1881	NamedValues[VarNames[i].first] = OldBindings[i];
				1882
				1883	// Return the body computation.
				1884	return BodyVal;
				1885	}
				1886
				1887
				1888	Function *PrototypeAST::Codegen() {
				1889	// Make the function type: double(double,double) etc.
				1890	std::vector<const Type*> Doubles(Args.size(), Type::DoubleTy);
				1891	FunctionType *FT = FunctionType::get(Type::DoubleTy, Doubles, false);
				1892
				1893	Function *F = new Function(FT, Function::ExternalLinkage, Name, TheModule);
				1894
				1895	// If F conflicted, there was already something named 'Name'. If it has a
				1896	// body, don't allow redefinition or reextern.
				1897	if (F->getName() != Name) {
				1898	// Delete the one we just made and get the existing one.
				1899	F->eraseFromParent();
				1900	F = TheModule->getFunction(Name);
				1901
				1902	// If F already has a body, reject this.
				1903	if (!F->empty()) {
				1904	ErrorF("redefinition of function");
				1905	return 0;
				1906	}
				1907
				1908	// If F took a different number of args, reject.
				1909	if (F->arg_size() != Args.size()) {
				1910	ErrorF("redefinition of function with different # args");
				1911	return 0;
				1912	}
				1913	}
				1914
				1915	// Set names for all arguments.
				1916	unsigned Idx = 0;
				1917	for (Function::arg_iterator AI = F->arg_begin(); Idx != Args.size();
				1918	++AI, ++Idx)
				1919	AI->setName(Args[Idx]);
				1920
				1921	return F;
				1922	}
				1923
				1924	/// CreateArgumentAllocas - Create an alloca for each argument and register the
				1925	/// argument in the symbol table so that references to it will succeed.
				1926	void PrototypeAST::CreateArgumentAllocas(Function *F) {
				1927	Function::arg_iterator AI = F->arg_begin();
				1928	for (unsigned Idx = 0, e = Args.size(); Idx != e; ++Idx, ++AI) {
				1929	// Create an alloca for this variable.
				1930	AllocaInst *Alloca = CreateEntryBlockAlloca(F, Args[Idx]);
				1931
				1932	// Store the initial value into the alloca.
				1933	Builder.CreateStore(AI, Alloca);
				1934
				1935	// Add arguments to variable symbol table.
				1936	NamedValues[Args[Idx]] = Alloca;
				1937	}
				1938	}
				1939
				1940
				1941	Function *FunctionAST::Codegen() {
				1942	NamedValues.clear();
				1943
				1944	Function *TheFunction = Proto->Codegen();
				1945	if (TheFunction == 0)
				1946	return 0;
				1947
				1948	// If this is an operator, install it.
				1949	if (Proto->isBinaryOp())
				1950	BinopPrecedence[Proto->getOperatorName()] = Proto->getBinaryPrecedence();
				1951
				1952	// Create a new basic block to start insertion into.
				1953	BasicBlock *BB = new BasicBlock("entry", TheFunction);
				1954	Builder.SetInsertPoint(BB);
				1955
				1956	// Add all arguments to the symbol table and create their allocas.
				1957	Proto->CreateArgumentAllocas(TheFunction);
				1958
				1959	if (Value *RetVal = Body->Codegen()) {
				1960	// Finish off the function.
				1961	Builder.CreateRet(RetVal);
				1962
				1963	// Validate the generated code, checking for consistency.
				1964	verifyFunction(*TheFunction);
				1965
				1966	// Optimize the function.
				1967	TheFPM->run(*TheFunction);
				1968
				1969	return TheFunction;
				1970	}
				1971
				1972	// Error reading body, remove function.
				1973	TheFunction->eraseFromParent();
				1974
				1975	if (Proto->isBinaryOp())
				1976	BinopPrecedence.erase(Proto->getOperatorName());
				1977	return 0;
				1978	}
				1979
				1980	//===----------------------------------------------------------------------===//
				1981	// Top-Level parsing and JIT Driver
				1982	//===----------------------------------------------------------------------===//
				1983
				1984	static ExecutionEngine *TheExecutionEngine;
				1985
				1986	static void HandleDefinition() {
				1987	if (FunctionAST *F = ParseDefinition()) {
				1988	if (Function *LF = F->Codegen()) {
				1989	fprintf(stderr, "Read function definition:");
				1990	LF->dump();
				1991	}
				1992	} else {
				1993	// Skip token for error recovery.
				1994	getNextToken();
				1995	}
				1996	}
				1997
				1998	static void HandleExtern() {
				1999	if (PrototypeAST *P = ParseExtern()) {
				2000	if (Function *F = P->Codegen()) {
				2001	fprintf(stderr, "Read extern: ");
				2002	F->dump();
				2003	}
				2004	} else {
				2005	// Skip token for error recovery.
				2006	getNextToken();
				2007	}
				2008	}
				2009
				2010	static void HandleTopLevelExpression() {
				2011	// Evaluate a top level expression into an anonymous function.
				2012	if (FunctionAST *F = ParseTopLevelExpr()) {
				2013	if (Function *LF = F->Codegen()) {
				2014	// JIT the function, returning a function pointer.
				2015	void *FPtr = TheExecutionEngine->getPointerToFunction(LF);
				2016
				2017	// Cast it to the right type (takes no arguments, returns a double) so we
				2018	// can call it as a native function.
				2019	double (*FP)() = (double (*)())FPtr;
				2020	fprintf(stderr, "Evaluated to %f\n", FP());
				2021	}
				2022	} else {
				2023	// Skip token for error recovery.
				2024	getNextToken();
				2025	}
				2026	}
				2027
				2028	/// top ::= definition | external | expression | ';'
				2029	static void MainLoop() {
				2030	while (1) {
				2031	fprintf(stderr, "ready> ");
				2032	switch (CurTok) {
				2033	case tok_eof: return;
				2034	case ';': getNextToken(); break; // ignore top level semicolons.
				2035	case tok_def: HandleDefinition(); break;
				2036	case tok_extern: HandleExtern(); break;
				2037	default: HandleTopLevelExpression(); break;
				2038	}
				2039	}
				2040	}
				2041
				2042
				2043
				2044	//===----------------------------------------------------------------------===//
				2045	// "Library" functions that can be "extern'd" from user code.
				2046	//===----------------------------------------------------------------------===//
				2047
				2048	/// putchard - putchar that takes a double and returns 0.
				2049	extern "C"
				2050	double putchard(double X) {
				2051	putchar((char)X);
				2052	return 0;
				2053	}
				2054
				2055	/// printd - printf that takes a double and prints it as "%f\n", returning 0.
				2056	extern "C"
				2057	double printd(double X) {
				2058	printf("%f\n", X);
				2059	return 0;
				2060	}
				2061
				2062	//===----------------------------------------------------------------------===//
				2063	// Main driver code.
				2064	//===----------------------------------------------------------------------===//
				2065
				2066	int main() {
				2067	// Install standard binary operators.
				2068	// 1 is lowest precedence.
				2069	BinopPrecedence['='] = 2;
				2070	BinopPrecedence['<'] = 10;
				2071	BinopPrecedence['+'] = 20;
				2072	BinopPrecedence['-'] = 20;
				2073	BinopPrecedence['*'] = 40; // highest.
				2074
				2075	// Prime the first token.
				2076	fprintf(stderr, "ready> ");
				2077	getNextToken();
				2078
				2079	// Make the module, which holds all the code.
				2080	TheModule = new Module("my cool jit");
				2081
				2082	// Create the JIT.
				2083	TheExecutionEngine = ExecutionEngine::create(TheModule);
				2084
				2085	{
				2086	ExistingModuleProvider OurModuleProvider(TheModule);
				2087	FunctionPassManager OurFPM(&OurModuleProvider);
				2088
				2089	// Set up the optimizer pipeline. Start with registering info about how the
				2090	// target lays out data structures.
				2091	OurFPM.add(new TargetData(*TheExecutionEngine->getTargetData()));
				2092	// Promote allocas to registers.
				2093	OurFPM.add(createPromoteMemoryToRegisterPass());
				2094	// Do simple "peephole" optimizations and bit-twiddling optzns.
				2095	OurFPM.add(createInstructionCombiningPass());
				2096	// Reassociate expressions.
				2097	OurFPM.add(createReassociatePass());
				2098	// Eliminate Common SubExpressions.
				2099	OurFPM.add(createGVNPass());
				2100	// Simplify the control flow graph (deleting unreachable blocks, etc).
				2101	OurFPM.add(createCFGSimplificationPass());
				2102
				2103	// Set the global so the code gen can use this.
				2104	TheFPM = &OurFPM;
				2105
				2106	// Run the main "interpreter loop" now.
				2107	MainLoop();
				2108
				2109	TheFPM = 0;
				2110	} // Free module provider and pass manager.
				2111
				2112
				2113	// Print out all of the generated code.
				2114	TheModule->dump();
				2115	return 0;
				2116	}
				2117	</pre>
				2118	</div>
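<p>The listing's <tt>ForExprAST::Codegen</tt> saves any existing binding for the
loop variable, installs the new alloca, and restores or erases the entry once
the loop has been emitted. The following is a minimal sketch of that
save/shadow/restore pattern on its own, without LLVM. The names
<tt>Bindings</tt>, <tt>Lookup</tt>, <tt>EvalWithShadow</tt> and <tt>ReadA</tt>
are hypothetical and do not appear in the tutorial code, and plain
<tt>int</tt> slots stand in for <tt>AllocaInst</tt> pointers.</p>

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-in for NamedValues: names map to int slots instead of
// AllocaInst pointers, so the sketch builds without LLVM.
static std::map<std::string, const int *> Bindings;

// Read the value currently bound to Name; -1 means "unbound" in this sketch.
static int Lookup(const std::string &Name) {
  std::map<std::string, const int *>::const_iterator I = Bindings.find(Name);
  if (I == Bindings.end() || I->second == 0) return -1;
  return *I->second;
}

// Bind Name to NewSlot while Body runs, then put the previous binding back.
static int EvalWithShadow(const std::string &Name, const int *NewSlot,
                          int (*Body)()) {
  assert(NewSlot && "shadowing binding must not be null");

  const int *OldSlot = Bindings[Name];  // null when Name was unbound
  Bindings[Name] = NewSlot;

  int Result = Body();

  // Restore the unshadowed binding, or drop the entry if there was none.
  if (OldSlot)
    Bindings[Name] = OldSlot;
  else
    Bindings.erase(Name);
  return Result;
}

// Example body: reads whatever "a" is bound to at the time of the call.
static int ReadA() { return Lookup("a"); }
```

<p>If <tt>"a"</tt> is bound to one slot and <tt>EvalWithShadow("a", &amp;Other,
ReadA)</tt> is called, the body sees <tt>Other</tt>, and afterwards
<tt>"a"</tt> refers to the original slot again. Shadowing a name that was
previously unbound leaves no entry behind.</p>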
				2119
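<p><tt>HandleTopLevelExpression</tt> receives the JIT-compiled code as a
<tt>void*</tt> and casts it to a pointer to a function that takes no arguments
and returns a <tt>double</tt> before calling it. The sketch below shows only
that cast, using an ordinary C++ function in place of JIT output.
<tt>FortyTwo</tt>, <tt>GetRawPointer</tt> and <tt>CallAsNullaryDouble</tt> are
hypothetical names used for illustration. Converting between <tt>void*</tt> and
a function pointer is only conditionally supported by the C++ standard; this
assumes a platform where it works, as GCC and Clang allow on common
targets.</p>

```cpp
#include <cassert>

// Hypothetical stand-in for a JIT-compiled top-level expression.
static double FortyTwo() { return 42.0; }

// Hypothetical stand-in for getPointerToFunction: returns the code as void*.
static void *GetRawPointer() {
  return reinterpret_cast<void *>(&FortyTwo);
}

// Cast the raw pointer to "function taking no arguments, returning double"
// and call it as a native function.
static double CallAsNullaryDouble(void *FPtr) {
  assert(FPtr && "expected a non-null code pointer");
  double (*FP)() = reinterpret_cast<double (*)()>(FPtr);
  return FP();
}
```

<p>The parentheses around <tt>*FP</tt> matter: without them the declaration
would name a function rather than a pointer to one.</p>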
				2120	</div>
				2121
				2122	<!-- *********************************************************************** -->
				2123	<hr>
				2124	<address>
				2125	<a href="http://jigsaw.w3.org/css-validator/check/referer"><img
				2126	src="http://jigsaw.w3.org/css-validator/images/vcss" alt="Valid CSS!"></a>
				2127	<a href="http://validator.w3.org/check/referer"><img
				2128	src="http://www.w3.org/Icons/valid-html401" alt="Valid HTML 4.01!"></a>
				2129
				2130	<a href="mailto:sabre@nondot.org">Chris Lattner</a><br>
				2131	<a href="http://llvm.org">The LLVM Compiler Infrastructure</a><br>
				2132	Last modified: $Date: 2007-10-17 11:05:13 -0700 (Wed, 17 Oct 2007) $
				2133	</address>
				2134	</body>
				2135	</html>