Blame - docs/tutorial/OCamlLangImpl7.rst - fp2-dev/platform/external/llvm

blob: 07da3a8ff963f8125064a711dea742d7bfc89c05 [file] [log] [blame]

Sean Silva	ee47edf	2012-12-05 00:26:32 +0000	[diff] [blame]	1	=======================================================
				2	Kaleidoscope: Extending the Language: Mutable Variables
				3	=======================================================
				4
				5	.. contents::
				6	:local:
				7
				8	Written by `Chris Lattner <mailto:sabre@nondot.org>`_ and `Erick
				9	Tryzelaar <mailto:idadesub@users.sourceforge.net>`_
				10
				11	Chapter 7 Introduction
				12	======================
				13
				14	Welcome to Chapter 7 of the "`Implementing a language with
				15	LLVM <index.html>`_" tutorial. In chapters 1 through 6, we've built a
				16	very respectable, albeit simple, `functional programming
				17	language <http://en.wikipedia.org/wiki/Functional_programming>`_. In our
				18	journey, we learned some parsing techniques, how to build and represent
				19	an AST, how to build LLVM IR, and how to optimize the resultant code as
				20	well as JIT compile it.
				21
				22	While Kaleidoscope is interesting as a functional language, the fact
				23	that it is functional makes it "too easy" to generate LLVM IR for it. In
				24	particular, a functional language makes it very easy to build LLVM IR
				25	directly in `SSA
				26	form <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_.
				27	Since LLVM requires that the input code be in SSA form, this is a very
				28	nice property and it is often unclear to newcomers how to generate code
				29	for an imperative language with mutable variables.
				30
				31	The short (and happy) summary of this chapter is that there is no need
				32	for your front-end to build SSA form: LLVM provides highly tuned and
				33	well tested support for this, though the way it works is a bit
				34	unexpected for some.
				35
				36	Why is this a hard problem?
				37	===========================
				38
				39	To understand why mutable variables cause complexities in SSA
				40	construction, consider this extremely simple C example:
				41
				42	.. code-block:: c
				43
				44	int G, H;
				45	int test(_Bool Condition) {
				46	int X;
				47	if (Condition)
				48	X = G;
				49	else
				50	X = H;
				51	return X;
				52	}
				53
				54	In this case, we have the variable "X", whose value depends on the path
				55	executed in the program. Because there are two different possible values
				56	for X before the return instruction, a PHI node is inserted to merge the
				57	two values. The LLVM IR that we want for this example looks like this:
				58
				59	.. code-block:: llvm
				60
				61	@G = weak global i32 0 ; type of @G is i32*
				62	@H = weak global i32 0 ; type of @H is i32*
				63
				64	define i32 @test(i1 %Condition) {
				65	entry:
				66	br i1 %Condition, label %cond_true, label %cond_false
				67
				68	cond_true:
				69	%X.0 = load i32* @G
				70	br label %cond_next
				71
				72	cond_false:
				73	%X.1 = load i32* @H
				74	br label %cond_next
				75
				76	cond_next:
				77	%X.2 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
				78	ret i32 %X.2
				79	}
				80
				81	In this example, the loads from the G and H global variables are
				82	explicit in the LLVM IR, and they live in the then/else branches of the
				83	if statement (cond\_true/cond\_false). In order to merge the incoming
				84	values, the X.2 phi node in the cond\_next block selects the right value
				85	to use based on where control flow is coming from: if control flow comes
				86	from the cond\_false block, X.2 gets the value of X.1. Alternatively, if
				87	control flow comes from cond\_true, it gets the value of X.0. The intent
				88	of this chapter is not to explain the details of SSA form. For more
				89	information, see one of the many `online
				90	references <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_.
				91
				92	The question for this article is "who places the phi nodes when lowering
				93	assignments to mutable variables?". The issue here is that LLVM
				94	requires that its IR be in SSA form: there is no "non-ssa" mode for
				95	it. However, SSA construction requires non-trivial algorithms and data
				96	structures, so it is inconvenient and wasteful for every front-end to
				97	have to reproduce this logic.
				98
				99	Memory in LLVM
				100	==============
				101
				102	The 'trick' here is that while LLVM does require all register values to
				103	be in SSA form, it does not require (or permit) memory objects to be in
				104	SSA form. In the example above, note that the loads from G and H are
				105	direct accesses to G and H: they are not renamed or versioned. This
				106	differs from some other compiler systems, which do try to version memory
				107	objects. In LLVM, instead of encoding dataflow analysis of memory into
				108	the LLVM IR, it is handled with `Analysis
				109	Passes <../WritingAnLLVMPass.html>`_ which are computed on demand.
				110
				111	With this in mind, the high-level idea is that we want to make a stack
				112	variable (which lives in memory, because it is on the stack) for each
				113	mutable object in a function. To take advantage of this trick, we need
				114	to talk about how LLVM represents stack variables.
				115
				116	In LLVM, all memory accesses are explicit with load/store instructions,
				117	and it is carefully designed not to have (or need) an "address-of"
				118	operator. Notice how the type of the @G/@H global variables is actually
				119	"i32\*" even though the variable is defined as "i32". What this means is
				120	that @G defines space for an i32 in the global data area, but its
				121	name actually refers to the address for that space. Stack variables
				122	work the same way, except that instead of being declared with global
				123	variable definitions, they are declared with the `LLVM alloca
				124	instruction <../LangRef.html#i_alloca>`_:
				125
				126	.. code-block:: llvm
				127
				128	define i32 @example() {
				129	entry:
				130	%X = alloca i32 ; type of %X is i32*.
				131	...
				132	%tmp = load i32* %X ; load the stack value %X from the stack.
				133	%tmp2 = add i32 %tmp, 1 ; increment it
				134	store i32 %tmp2, i32* %X ; store it back
				135	...
				136
				137	This code shows an example of how you can declare and manipulate a stack
				138	variable in the LLVM IR. Stack memory allocated with the alloca
				139	instruction is fully general: you can pass the address of the stack slot
				140	to functions, you can store it in other variables, etc. In our example
				141	above, we could rewrite the example to use the alloca technique to avoid
				142	using a PHI node:
				143
				144	.. code-block:: llvm
				145
				146	@G = weak global i32 0 ; type of @G is i32*
				147	@H = weak global i32 0 ; type of @H is i32*
				148
				149	define i32 @test(i1 %Condition) {
				150	entry:
				151	%X = alloca i32 ; type of %X is i32*.
				152	br i1 %Condition, label %cond_true, label %cond_false
				153
				154	cond_true:
				155	%X.0 = load i32* @G
				156	store i32 %X.0, i32* %X ; Update X
				157	br label %cond_next
				158
				159	cond_false:
				160	%X.1 = load i32* @H
				161	store i32 %X.1, i32* %X ; Update X
				162	br label %cond_next
				163
				164	cond_next:
				165	%X.2 = load i32* %X ; Read X
				166	ret i32 %X.2
				167	}
				168
				169	With this, we have discovered a way to handle arbitrary mutable
				170	variables without the need to create Phi nodes at all:
				171
				172	#. Each mutable variable becomes a stack allocation.
				173	#. Each read of the variable becomes a load from the stack.
				174	#. Each update of the variable becomes a store to the stack.
				175	#. Taking the address of a variable just uses the stack address
				176	directly.
				177
				178	While this solution has solved our immediate problem, it introduced
				179	another one: we have now apparently introduced a lot of stack traffic
				180	for very simple and common operations, a major performance problem.
				181	Fortunately for us, the LLVM optimizer has a highly-tuned optimization
				182	pass named "mem2reg" that handles this case, promoting allocas like this
				183	into SSA registers, inserting Phi nodes as appropriate. If you run this
				184	example through the pass, for example, you'll get:
				185
				186	.. code-block:: bash
				187
				188	$ llvm-as < example.ll \| opt -mem2reg \| llvm-dis
				189	@G = weak global i32 0
				190	@H = weak global i32 0
				191
				192	define i32 @test(i1 %Condition) {
				193	entry:
				194	br i1 %Condition, label %cond_true, label %cond_false
				195
				196	cond_true:
				197	%X.0 = load i32* @G
				198	br label %cond_next
				199
				200	cond_false:
				201	%X.1 = load i32* @H
				202	br label %cond_next
				203
				204	cond_next:
				205	%X.01 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
				206	ret i32 %X.01
				207	}
				208
				209	The mem2reg pass implements the standard "iterated dominance frontier"
				210	algorithm for constructing SSA form and has a number of optimizations
				211	that speed up (very common) degenerate cases. The mem2reg optimization
				212	pass is the answer to dealing with mutable variables, and we highly
				213	recommend that you depend on it. Note that mem2reg only works on
				214	variables in certain circumstances:
				215
				216	#. mem2reg is alloca-driven: it looks for allocas and if it can handle
				217	them, it promotes them. It does not apply to global variables or heap
				218	allocations.
				219	#. mem2reg only looks for alloca instructions in the entry block of the
				220	function. Being in the entry block guarantees that the alloca is only
				221	executed once, which makes analysis simpler.
				222	#. mem2reg only promotes allocas whose uses are direct loads and stores.
				223	If the address of the stack object is passed to a function, or if any
				224	funny pointer arithmetic is involved, the alloca will not be
				225	promoted.
				226	#. mem2reg only works on allocas of `first
				227	class <../LangRef.html#t_classifications>`_ values (such as pointers,
				228	scalars and vectors), and only if the array size of the allocation is
				229	1 (or missing in the .ll file). mem2reg is not capable of promoting
				230	structs or arrays to registers. Note that the "scalarrepl" pass is
				231	more powerful and can promote structs, "unions", and arrays in many
				232	cases.
				233
				234	All of these properties are easy to satisfy for most imperative
				235	languages, and we'll illustrate it below with Kaleidoscope. The final
				236	question you may be asking is: should I bother with this nonsense for my
				237	front-end? Wouldn't it be better if I just did SSA construction
				238	directly, avoiding use of the mem2reg optimization pass? In short, we
				239	strongly recommend that you use this technique for building SSA form,
				240	unless there is an extremely good reason not to. Using this technique
				241	is:
				242
				243	- Proven and well tested: llvm-gcc and clang both use this technique
				244	for local mutable variables. As such, the most common clients of LLVM
				245	are using this to handle a bulk of their variables. You can be sure
				246	that bugs are found fast and fixed early.
				247	- Extremely Fast: mem2reg has a number of special cases that make it
				248	fast in common cases as well as fully general. For example, it has
				249	fast-paths for variables that are only used in a single block,
				250	variables that only have one assignment point, good heuristics to
				251	avoid insertion of unneeded phi nodes, etc.
				252	- Needed for debug info generation: `Debug information in
				253	LLVM <../SourceLevelDebugging.html>`_ relies on having the address of
				254	the variable exposed so that debug info can be attached to it. This
				255	technique dovetails very naturally with this style of debug info.
				256
				257	If nothing else, this makes it much easier to get your front-end up and
				258	running, and is very simple to implement. Lets extend Kaleidoscope with
				259	mutable variables now!
				260
				261	Mutable Variables in Kaleidoscope
				262	=================================
				263
				264	Now that we know the sort of problem we want to tackle, lets see what
				265	this looks like in the context of our little Kaleidoscope language.
				266	We're going to add two features:
				267
				268	#. The ability to mutate variables with the '=' operator.
				269	#. The ability to define new variables.
				270
				271	While the first item is really what this is about, we only have
				272	variables for incoming arguments as well as for induction variables, and
				273	redefining those only goes so far :). Also, the ability to define new
				274	variables is a useful thing regardless of whether you will be mutating
				275	them. Here's a motivating example that shows how we could use these:
				276
				277	::
				278
				279	# Define ':' for sequencing: as a low-precedence operator that ignores operands
				280	# and just returns the RHS.
				281	def binary : 1 (x y) y;
				282
				283	# Recursive fib, we could do this before.
				284	def fib(x)
				285	if (x < 3) then
				286	1
				287	else
				288	fib(x-1)+fib(x-2);
				289
				290	# Iterative fib.
				291	def fibi(x)
				292	var a = 1, b = 1, c in
				293	(for i = 3, i < x in
				294	c = a + b :
				295	a = b :
				296	b = c) :
				297	b;
				298
				299	# Call it.
				300	fibi(10);
				301
				302	In order to mutate variables, we have to change our existing variables
				303	to use the "alloca trick". Once we have that, we'll add our new
				304	operator, then extend Kaleidoscope to support new variable definitions.
				305
				306	Adjusting Existing Variables for Mutation
				307	=========================================
				308
				309	The symbol table in Kaleidoscope is managed at code generation time by
				310	the '``named_values``' map. This map currently keeps track of the LLVM
				311	"Value\*" that holds the double value for the named variable. In order
				312	to support mutation, we need to change this slightly, so that it
				313	``named_values`` holds the memory location of the variable in
				314	question. Note that this change is a refactoring: it changes the
				315	structure of the code, but does not (by itself) change the behavior of
				316	the compiler. All of these changes are isolated in the Kaleidoscope code
				317	generator.
				318
				319	At this point in Kaleidoscope's development, it only supports variables
				320	for two things: incoming arguments to functions and the induction
				321	variable of 'for' loops. For consistency, we'll allow mutation of these
				322	variables in addition to other user-defined variables. This means that
				323	these will both need memory locations.
				324
				325	To start our transformation of Kaleidoscope, we'll change the
				326	``named_values`` map so that it maps to AllocaInst\* instead of Value\*.
				327	Once we do this, the C++ compiler will tell us what parts of the code we
				328	need to update:
				329
				330	Note: the ocaml bindings currently model both ``Value*``'s and
				331	``AllocInst*``'s as ``Llvm.llvalue``'s, but this may change in the future
				332	to be more type safe.
				333
				334	.. code-block:: ocaml
				335
				336	let named_values:(string, llvalue) Hashtbl.t = Hashtbl.create 10
				337
				338	Also, since we will need to create these alloca's, we'll use a helper
				339	function that ensures that the allocas are created in the entry block of
				340	the function:
				341
				342	.. code-block:: ocaml
				343
				344	(* Create an alloca instruction in the entry block of the function. This
				345	* is used for mutable variables etc. *)
				346	let create_entry_block_alloca the_function var_name =
				347	let builder = builder_at (instr_begin (entry_block the_function)) in
				348	build_alloca double_type var_name builder
				349
				350	This funny looking code creates an ``Llvm.llbuilder`` object that is
				351	pointing at the first instruction of the entry block. It then creates an
				352	alloca with the expected name and returns it. Because all values in
				353	Kaleidoscope are doubles, there is no need to pass in a type to use.
				354
				355	With this in place, the first functionality change we want to make is to
				356	variable references. In our new scheme, variables live on the stack, so
				357	code generating a reference to them actually needs to produce a load
				358	from the stack slot:
				359
				360	.. code-block:: ocaml
				361
				362	let rec codegen_expr = function
				363	...
				364	\| Ast.Variable name ->
				365	let v = try Hashtbl.find named_values name with
				366	\| Not_found -> raise (Error "unknown variable name")
				367	in
				368	(* Load the value. *)
				369	build_load v name builder
				370
				371	As you can see, this is pretty straightforward. Now we need to update
				372	the things that define the variables to set up the alloca. We'll start
				373	with ``codegen_expr Ast.For ...`` (see the `full code listing <#code>`_
				374	for the unabridged code):
				375
				376	.. code-block:: ocaml
				377
				378	\| Ast.For (var_name, start, end_, step, body) ->
				379	let the_function = block_parent (insertion_block builder) in
				380
				381	(* Create an alloca for the variable in the entry block. *)
				382	let alloca = create_entry_block_alloca the_function var_name in
				383
				384	(* Emit the start code first, without 'variable' in scope. *)
				385	let start_val = codegen_expr start in
				386
				387	(* Store the value into the alloca. *)
				388	ignore(build_store start_val alloca builder);
				389
				390	...
				391
				392	(* Within the loop, the variable is defined equal to the PHI node. If it
				393	* shadows an existing variable, we have to restore it, so save it
				394	* now. *)
				395	let old_val =
				396	try Some (Hashtbl.find named_values var_name) with Not_found -> None
				397	in
				398	Hashtbl.add named_values var_name alloca;
				399
				400	...
				401
				402	(* Compute the end condition. *)
				403	let end_cond = codegen_expr end_ in
				404
				405	(* Reload, increment, and restore the alloca. This handles the case where
				406	* the body of the loop mutates the variable. *)
				407	let cur_var = build_load alloca var_name builder in
				408	let next_var = build_add cur_var step_val "nextvar" builder in
				409	ignore(build_store next_var alloca builder);
				410	...
				411
				412	This code is virtually identical to the code `before we allowed mutable
				413	variables <OCamlLangImpl5.html#forcodegen>`_. The big difference is that
				414	we no longer have to construct a PHI node, and we use load/store to
				415	access the variable as needed.
				416
				417	To support mutable argument variables, we need to also make allocas for
				418	them. The code for this is also pretty simple:
				419
				420	.. code-block:: ocaml
				421
				422	(* Create an alloca for each argument and register the argument in the symbol
				423	* table so that references to it will succeed. *)
				424	let create_argument_allocas the_function proto =
				425	let args = match proto with
				426	\| Ast.Prototype (_, args) \| Ast.BinOpPrototype (_, args, _) -> args
				427	in
				428	Array.iteri (fun i ai ->
				429	let var_name = args.(i) in
				430	(* Create an alloca for this variable. *)
				431	let alloca = create_entry_block_alloca the_function var_name in
				432
				433	(* Store the initial value into the alloca. *)
				434	ignore(build_store ai alloca builder);
				435
				436	(* Add arguments to variable symbol table. *)
				437	Hashtbl.add named_values var_name alloca;
				438	) (params the_function)
				439
				440	For each argument, we make an alloca, store the input value to the
				441	function into the alloca, and register the alloca as the memory location
				442	for the argument. This method gets invoked by ``Codegen.codegen_func``
				443	right after it sets up the entry block for the function.
				444
				445	The final missing piece is adding the mem2reg pass, which allows us to
				446	get good codegen once again:
				447
				448	.. code-block:: ocaml
				449
				450	let main () =
				451	...
				452	let the_fpm = PassManager.create_function Codegen.the_module in
				453
				454	(* Set up the optimizer pipeline. Start with registering info about how the
				455	* target lays out data structures. *)
				456	DataLayout.add (ExecutionEngine.target_data the_execution_engine) the_fpm;
				457
				458	(* Promote allocas to registers. *)
				459	add_memory_to_register_promotion the_fpm;
				460
				461	(* Do simple "peephole" optimizations and bit-twiddling optzn. *)
				462	add_instruction_combining the_fpm;
				463
				464	(* reassociate expressions. *)
				465	add_reassociation the_fpm;
				466
				467	It is interesting to see what the code looks like before and after the
				468	mem2reg optimization runs. For example, this is the before/after code
				469	for our recursive fib function. Before the optimization:
				470
				471	.. code-block:: llvm
				472
				473	define double @fib(double %x) {
				474	entry:
				475	%x1 = alloca double
				476	store double %x, double* %x1
				477	%x2 = load double* %x1
				478	%cmptmp = fcmp ult double %x2, 3.000000e+00
				479	%booltmp = uitofp i1 %cmptmp to double
				480	%ifcond = fcmp one double %booltmp, 0.000000e+00
				481	br i1 %ifcond, label %then, label %else
				482
				483	then: ; preds = %entry
				484	br label %ifcont
				485
				486	else: ; preds = %entry
				487	%x3 = load double* %x1
				488	%subtmp = fsub double %x3, 1.000000e+00
				489	%calltmp = call double @fib(double %subtmp)
				490	%x4 = load double* %x1
				491	%subtmp5 = fsub double %x4, 2.000000e+00
				492	%calltmp6 = call double @fib(double %subtmp5)
				493	%addtmp = fadd double %calltmp, %calltmp6
				494	br label %ifcont
				495
				496	ifcont: ; preds = %else, %then
				497	%iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ]
				498	ret double %iftmp
				499	}
				500
				501	Here there is only one variable (x, the input argument) but you can
				502	still see the extremely simple-minded code generation strategy we are
				503	using. In the entry block, an alloca is created, and the initial input
				504	value is stored into it. Each reference to the variable does a reload
				505	from the stack. Also, note that we didn't modify the if/then/else
				506	expression, so it still inserts a PHI node. While we could make an
				507	alloca for it, it is actually easier to create a PHI node for it, so we
				508	still just make the PHI.
				509
				510	Here is the code after the mem2reg pass runs:
				511
				512	.. code-block:: llvm
				513
				514	define double @fib(double %x) {
				515	entry:
				516	%cmptmp = fcmp ult double %x, 3.000000e+00
				517	%booltmp = uitofp i1 %cmptmp to double
				518	%ifcond = fcmp one double %booltmp, 0.000000e+00
				519	br i1 %ifcond, label %then, label %else
				520
				521	then:
				522	br label %ifcont
				523
				524	else:
				525	%subtmp = fsub double %x, 1.000000e+00
				526	%calltmp = call double @fib(double %subtmp)
				527	%subtmp5 = fsub double %x, 2.000000e+00
				528	%calltmp6 = call double @fib(double %subtmp5)
				529	%addtmp = fadd double %calltmp, %calltmp6
				530	br label %ifcont
				531
				532	ifcont: ; preds = %else, %then
				533	%iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ]
				534	ret double %iftmp
				535	}
				536
				537	This is a trivial case for mem2reg, since there are no redefinitions of
				538	the variable. The point of showing this is to calm your tension about
				539	inserting such blatent inefficiencies :).
				540
				541	After the rest of the optimizers run, we get:
				542
				543	.. code-block:: llvm
				544
				545	define double @fib(double %x) {
				546	entry:
				547	%cmptmp = fcmp ult double %x, 3.000000e+00
				548	%booltmp = uitofp i1 %cmptmp to double
				549	%ifcond = fcmp ueq double %booltmp, 0.000000e+00
				550	br i1 %ifcond, label %else, label %ifcont
				551
				552	else:
				553	%subtmp = fsub double %x, 1.000000e+00
				554	%calltmp = call double @fib(double %subtmp)
				555	%subtmp5 = fsub double %x, 2.000000e+00
				556	%calltmp6 = call double @fib(double %subtmp5)
				557	%addtmp = fadd double %calltmp, %calltmp6
				558	ret double %addtmp
				559
				560	ifcont:
				561	ret double 1.000000e+00
				562	}
				563
				564	Here we see that the simplifycfg pass decided to clone the return
				565	instruction into the end of the 'else' block. This allowed it to
				566	eliminate some branches and the PHI node.
				567
				568	Now that all symbol table references are updated to use stack variables,
				569	we'll add the assignment operator.
				570
				571	New Assignment Operator
				572	=======================
				573
				574	With our current framework, adding a new assignment operator is really
				575	simple. We will parse it just like any other binary operator, but handle
				576	it internally (instead of allowing the user to define it). The first
				577	step is to set a precedence:
				578
				579	.. code-block:: ocaml
				580
				581	let main () =
				582	(* Install standard binary operators.
				583	* 1 is the lowest precedence. *)
				584	Hashtbl.add Parser.binop_precedence '=' 2;
				585	Hashtbl.add Parser.binop_precedence '<' 10;
				586	Hashtbl.add Parser.binop_precedence '+' 20;
				587	Hashtbl.add Parser.binop_precedence '-' 20;
				588	...
				589
				590	Now that the parser knows the precedence of the binary operator, it
				591	takes care of all the parsing and AST generation. We just need to
				592	implement codegen for the assignment operator. This looks like:
				593
				594	.. code-block:: ocaml
				595
				596	let rec codegen_expr = function
				597	begin match op with
				598	\| '=' ->
				599	(* Special case '=' because we don't want to emit the LHS as an
				600	* expression. *)
				601	let name =
				602	match lhs with
				603	\| Ast.Variable name -> name
				604	\| _ -> raise (Error "destination of '=' must be a variable")
				605	in
				606
				607	Unlike the rest of the binary operators, our assignment operator doesn't
				608	follow the "emit LHS, emit RHS, do computation" model. As such, it is
				609	handled as a special case before the other binary operators are handled.
				610	The other strange thing is that it requires the LHS to be a variable. It
				611	is invalid to have "(x+1) = expr" - only things like "x = expr" are
				612	allowed.
				613
				614	.. code-block:: ocaml
				615
				616	(* Codegen the rhs. *)
				617	let val_ = codegen_expr rhs in
				618
				619	(* Lookup the name. *)
				620	let variable = try Hashtbl.find named_values name with
				621	\| Not_found -> raise (Error "unknown variable name")
				622	in
				623	ignore(build_store val_ variable builder);
				624	val_
				625	\| _ ->
				626	...
				627
				628	Once we have the variable, codegen'ing the assignment is
				629	straightforward: we emit the RHS of the assignment, create a store, and
				630	return the computed value. Returning a value allows for chained
				631	assignments like "X = (Y = Z)".
				632
				633	Now that we have an assignment operator, we can mutate loop variables
				634	and arguments. For example, we can now run code like this:
				635
				636	::
				637
				638	# Function to print a double.
				639	extern printd(x);
				640
				641	# Define ':' for sequencing: as a low-precedence operator that ignores operands
				642	# and just returns the RHS.
				643	def binary : 1 (x y) y;
				644
				645	def test(x)
				646	printd(x) :
				647	x = 4 :
				648	printd(x);
				649
				650	test(123);
				651
				652	When run, this example prints "123" and then "4", showing that we did
				653	actually mutate the value! Okay, we have now officially implemented our
				654	goal: getting this to work requires SSA construction in the general
				655	case. However, to be really useful, we want the ability to define our
				656	own local variables, lets add this next!
				657
				658	User-defined Local Variables
				659	============================
				660
				661	Adding var/in is just like any other other extensions we made to
				662	Kaleidoscope: we extend the lexer, the parser, the AST and the code
				663	generator. The first step for adding our new 'var/in' construct is to
				664	extend the lexer. As before, this is pretty trivial, the code looks like
				665	this:
				666
				667	.. code-block:: ocaml
				668
				669	type token =
				670	...
				671	(* var definition *)
				672	\| Var
				673
				674	...
				675
				676	and lex_ident buffer = parser
				677	...
				678	\| "in" -> [< 'Token.In; stream >]
				679	\| "binary" -> [< 'Token.Binary; stream >]
				680	\| "unary" -> [< 'Token.Unary; stream >]
				681	\| "var" -> [< 'Token.Var; stream >]
				682	...
				683
				684	The next step is to define the AST node that we will construct. For
				685	var/in, it looks like this:
				686
				687	.. code-block:: ocaml
				688
				689	type expr =
				690	...
				691	(* variant for var/in. *)
				692	\| Var of (string * expr option) array * expr
				693	...
				694
				695	var/in allows a list of names to be defined all at once, and each name
				696	can optionally have an initializer value. As such, we capture this
				697	information in the VarNames vector. Also, var/in has a body, this body
				698	is allowed to access the variables defined by the var/in.
				699
				700	With this in place, we can define the parser pieces. The first thing we
				701	do is add it as a primary expression:
				702
				703	.. code-block:: ocaml
				704
				705	(* primary
				706	* ::= identifier
				707	* ::= numberexpr
				708	* ::= parenexpr
				709	* ::= ifexpr
				710	* ::= forexpr
				711	* ::= varexpr *)
				712	let rec parse_primary = parser
				713	...
				714	(* varexpr
				715	* ::= 'var' identifier ('=' expression?
				716	* (',' identifier ('=' expression)?)* 'in' expression *)
				717	\| [< 'Token.Var;
				718	(* At least one variable name is required. *)
				719	'Token.Ident id ?? "expected identifier after var";
				720	init=parse_var_init;
				721	var_names=parse_var_names [(id, init)];
				722	(* At this point, we have to have 'in'. *)
				723	'Token.In ?? "expected 'in' keyword after 'var'";
				724	body=parse_expr >] ->
				725	Ast.Var (Array.of_list (List.rev var_names), body)
				726
				727	...
				728
				729	and parse_var_init = parser
				730	(* read in the optional initializer. *)
				731	\| [< 'Token.Kwd '='; e=parse_expr >] -> Some e
				732	\| [< >] -> None
				733
				734	and parse_var_names accumulator = parser
				735	\| [< 'Token.Kwd ',';
				736	'Token.Ident id ?? "expected identifier list after var";
				737	init=parse_var_init;
				738	e=parse_var_names ((id, init) :: accumulator) >] -> e
				739	\| [< >] -> accumulator
				740
				741	Now that we can parse and represent the code, we need to support
				742	emission of LLVM IR for it. This code starts out with:
				743
				744	.. code-block:: ocaml
				745
				746	let rec codegen_expr = function
				747	...
				748	\| Ast.Var (var_names, body)
				749	let old_bindings = ref [] in
				750
				751	let the_function = block_parent (insertion_block builder) in
				752
				753	(* Register all variables and emit their initializer. *)
				754	Array.iter (fun (var_name, init) ->
				755
				756	Basically it loops over all the variables, installing them one at a
				757	time. For each variable we put into the symbol table, we remember the
				758	previous value that we replace in OldBindings.
				759
				760	.. code-block:: ocaml
				761
				762	(* Emit the initializer before adding the variable to scope, this
				763	* prevents the initializer from referencing the variable itself, and
				764	* permits stuff like this:
				765	* var a = 1 in
				766	* var a = a in ... # refers to outer 'a'. *)
				767	let init_val =
				768	match init with
				769	\| Some init -> codegen_expr init
				770	(* If not specified, use 0.0. *)
				771	\| None -> const_float double_type 0.0
				772	in
				773
				774	let alloca = create_entry_block_alloca the_function var_name in
				775	ignore(build_store init_val alloca builder);
				776
				777	(* Remember the old variable binding so that we can restore the binding
				778	* when we unrecurse. *)
				779
				780	begin
				781	try
				782	let old_value = Hashtbl.find named_values var_name in
				783	old_bindings := (var_name, old_value) :: !old_bindings;
				784	with Not_found > ()
				785	end;
				786
				787	(* Remember this binding. *)
				788	Hashtbl.add named_values var_name alloca;
				789	) var_names;
				790
				791	There are more comments here than code. The basic idea is that we emit
				792	the initializer, create the alloca, then update the symbol table to
				793	point to it. Once all the variables are installed in the symbol table,
				794	we evaluate the body of the var/in expression:
				795
				796	.. code-block:: ocaml
				797
				798	(* Codegen the body, now that all vars are in scope. *)
				799	let body_val = codegen_expr body in
				800
				801	Finally, before returning, we restore the previous variable bindings:
				802
				803	.. code-block:: ocaml
				804
				805	(* Pop all our variables from scope. *)
				806	List.iter (fun (var_name, old_value) ->
				807	Hashtbl.add named_values var_name old_value
				808	) !old_bindings;
				809
				810	(* Return the body computation. *)
				811	body_val
				812
				813	The end result of all of this is that we get properly scoped variable
				814	definitions, and we even (trivially) allow mutation of them :).
				815
				816	With this, we completed what we set out to do. Our nice iterative fib
				817	example from the intro compiles and runs just fine. The mem2reg pass
				818	optimizes all of our stack variables into SSA registers, inserting PHI
				819	nodes where needed, and our front-end remains simple: no "iterated
				820	dominance frontier" computation anywhere in sight.
				821
				822	Full Code Listing
				823	=================
				824
				825	Here is the complete code listing for our running example, enhanced with
				826	mutable variables and var/in support. To build this example, use:
				827
				828	.. code-block:: bash
				829
				830	# Compile
				831	ocamlbuild toy.byte
				832	# Run
				833	./toy.byte
				834
				835	Here is the code:
				836
				837	\_tags:
				838	::
				839
				840	<{lexer,parser}.ml>: use_camlp4, pp(camlp4of)
				841	<*.{byte,native}>: g++, use_llvm, use_llvm_analysis
				842	<*.{byte,native}>: use_llvm_executionengine, use_llvm_target
				843	<*.{byte,native}>: use_llvm_scalar_opts, use_bindings
				844
				845	myocamlbuild.ml:
				846	.. code-block:: ocaml
				847
				848	open Ocamlbuild_plugin;;
				849
				850	ocaml_lib ~extern:true "llvm";;
				851	ocaml_lib ~extern:true "llvm_analysis";;
				852	ocaml_lib ~extern:true "llvm_executionengine";;
				853	ocaml_lib ~extern:true "llvm_target";;
				854	ocaml_lib ~extern:true "llvm_scalar_opts";;
				855
				856	flag ["link"; "ocaml"; "g++"] (S[A"-cc"; A"g++"; A"-cclib"; A"-rdynamic"]);;
				857	dep ["link"; "ocaml"; "use_bindings"] ["bindings.o"];;
				858
				859	token.ml:
				860	.. code-block:: ocaml
				861
				862	(*===----------------------------------------------------------------------===
				863	* Lexer Tokens
				864	===----------------------------------------------------------------------===)
				865
				866	(* The lexer returns these 'Kwd' if it is an unknown character, otherwise one of
				867	* these others for known things. *)
				868	type token =
				869	(* commands *)
				870	\| Def \| Extern
				871
				872	(* primary *)
				873	\| Ident of string \| Number of float
				874
				875	(* unknown *)
				876	\| Kwd of char
				877
				878	(* control *)
				879	\| If \| Then \| Else
				880	\| For \| In
				881
				882	(* operators *)
				883	\| Binary \| Unary
				884
				885	(* var definition *)
				886	\| Var
				887
				888	lexer.ml:
				889	.. code-block:: ocaml
				890
				891	(*===----------------------------------------------------------------------===
				892	* Lexer
				893	===----------------------------------------------------------------------===)
				894
				895	let rec lex = parser
				896	(* Skip any whitespace. *)
				897	\| [< ' (' ' \| '\n' \| '\r' \| '\t'); stream >] -> lex stream
				898
				899	(* identifier: [a-zA-Z][a-zA-Z0-9] *)
				900	\| [< ' ('A' .. 'Z' \| 'a' .. 'z' as c); stream >] ->
				901	let buffer = Buffer.create 1 in
				902	Buffer.add_char buffer c;
				903	lex_ident buffer stream
				904
				905	(* number: [0-9.]+ *)
				906	\| [< ' ('0' .. '9' as c); stream >] ->
				907	let buffer = Buffer.create 1 in
				908	Buffer.add_char buffer c;
				909	lex_number buffer stream
				910
				911	(* Comment until end of line. *)
				912	\| [< ' ('#'); stream >] ->
				913	lex_comment stream
				914
				915	(* Otherwise, just return the character as its ascii value. *)
				916	\| [< 'c; stream >] ->
				917	[< 'Token.Kwd c; lex stream >]
				918
				919	(* end of stream. *)
				920	\| [< >] -> [< >]
				921
				922	and lex_number buffer = parser
				923	\| [< ' ('0' .. '9' \| '.' as c); stream >] ->
				924	Buffer.add_char buffer c;
				925	lex_number buffer stream
				926	\| [< stream=lex >] ->
				927	[< 'Token.Number (float_of_string (Buffer.contents buffer)); stream >]
				928
				929	and lex_ident buffer = parser
				930	\| [< ' ('A' .. 'Z' \| 'a' .. 'z' \| '0' .. '9' as c); stream >] ->
				931	Buffer.add_char buffer c;
				932	lex_ident buffer stream
				933	\| [< stream=lex >] ->
				934	match Buffer.contents buffer with
				935	\| "def" -> [< 'Token.Def; stream >]
				936	\| "extern" -> [< 'Token.Extern; stream >]
				937	\| "if" -> [< 'Token.If; stream >]
				938	\| "then" -> [< 'Token.Then; stream >]
				939	\| "else" -> [< 'Token.Else; stream >]
				940	\| "for" -> [< 'Token.For; stream >]
				941	\| "in" -> [< 'Token.In; stream >]
				942	\| "binary" -> [< 'Token.Binary; stream >]
				943	\| "unary" -> [< 'Token.Unary; stream >]
				944	\| "var" -> [< 'Token.Var; stream >]
				945	\| id -> [< 'Token.Ident id; stream >]
				946
				947	and lex_comment = parser
				948	\| [< ' ('\n'); stream=lex >] -> stream
				949	\| [< 'c; e=lex_comment >] -> e
				950	\| [< >] -> [< >]
				951
				952	ast.ml:
				953	.. code-block:: ocaml
				954
				955	(*===----------------------------------------------------------------------===
				956	* Abstract Syntax Tree (aka Parse Tree)
				957	===----------------------------------------------------------------------===)
				958
				959	(* expr - Base type for all expression nodes. *)
				960	type expr =
				961	(* variant for numeric literals like "1.0". *)
				962	\| Number of float
				963
				964	(* variant for referencing a variable, like "a". *)
				965	\| Variable of string
				966
				967	(* variant for a unary operator. *)
				968	\| Unary of char * expr
				969
				970	(* variant for a binary operator. *)
				971	\| Binary of char * expr * expr
				972
				973	(* variant for function calls. *)
				974	\| Call of string * expr array
				975
				976	(* variant for if/then/else. *)
				977	\| If of expr * expr * expr
				978
				979	(* variant for for/in. *)
				980	\| For of string * expr * expr * expr option * expr
				981
				982	(* variant for var/in. *)
				983	\| Var of (string * expr option) array * expr
				984
				985	(* proto - This type represents the "prototype" for a function, which captures
				986	* its name, and its argument names (thus implicitly the number of arguments the
				987	* function takes). *)
				988	type proto =
				989	\| Prototype of string * string array
				990	\| BinOpPrototype of string * string array * int
				991
				992	(* func - This type represents a function definition itself. *)
				993	type func = Function of proto * expr
				994
				995	parser.ml:
				996	.. code-block:: ocaml
				997
				998	(*===---------------------------------------------------------------------===
				999	* Parser
				1000	===---------------------------------------------------------------------===)
				1001
				1002	(* binop_precedence - This holds the precedence for each binary operator that is
				1003	* defined *)
				1004	let binop_precedence:(char, int) Hashtbl.t = Hashtbl.create 10
				1005
				1006	(* precedence - Get the precedence of the pending binary operator token. *)
				1007	let precedence c = try Hashtbl.find binop_precedence c with Not_found -> -1
				1008
				1009	(* primary
				1010	* ::= identifier
				1011	* ::= numberexpr
				1012	* ::= parenexpr
				1013	* ::= ifexpr
				1014	* ::= forexpr
				1015	* ::= varexpr *)
				1016	let rec parse_primary = parser
				1017	(* numberexpr ::= number *)
				1018	\| [< 'Token.Number n >] -> Ast.Number n
				1019
				1020	(* parenexpr ::= '(' expression ')' *)
				1021	\| [< 'Token.Kwd '('; e=parse_expr; 'Token.Kwd ')' ?? "expected ')'" >] -> e
				1022
				1023	(* identifierexpr
				1024	* ::= identifier
				1025	* ::= identifier '(' argumentexpr ')' *)
				1026	\| [< 'Token.Ident id; stream >] ->
				1027	let rec parse_args accumulator = parser
				1028	\| [< e=parse_expr; stream >] ->
				1029	begin parser
				1030	\| [< 'Token.Kwd ','; e=parse_args (e :: accumulator) >] -> e
				1031	\| [< >] -> e :: accumulator
				1032	end stream
				1033	\| [< >] -> accumulator
				1034	in
				1035	let rec parse_ident id = parser
				1036	(* Call. *)
				1037	\| [< 'Token.Kwd '(';
				1038	args=parse_args [];
				1039	'Token.Kwd ')' ?? "expected ')'">] ->
				1040	Ast.Call (id, Array.of_list (List.rev args))
				1041
				1042	(* Simple variable ref. *)
				1043	\| [< >] -> Ast.Variable id
				1044	in
				1045	parse_ident id stream
				1046
				1047	(* ifexpr ::= 'if' expr 'then' expr 'else' expr *)
				1048	\| [< 'Token.If; c=parse_expr;
				1049	'Token.Then ?? "expected 'then'"; t=parse_expr;
				1050	'Token.Else ?? "expected 'else'"; e=parse_expr >] ->
				1051	Ast.If (c, t, e)
				1052
				1053	(* forexpr
				1054	::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression *)
				1055	\| [< 'Token.For;
				1056	'Token.Ident id ?? "expected identifier after for";
				1057	'Token.Kwd '=' ?? "expected '=' after for";
				1058	stream >] ->
				1059	begin parser
				1060	\| [<
				1061	start=parse_expr;
				1062	'Token.Kwd ',' ?? "expected ',' after for";
				1063	end_=parse_expr;
				1064	stream >] ->
				1065	let step =
				1066	begin parser
				1067	\| [< 'Token.Kwd ','; step=parse_expr >] -> Some step
				1068	\| [< >] -> None
				1069	end stream
				1070	in
				1071	begin parser
				1072	\| [< 'Token.In; body=parse_expr >] ->
				1073	Ast.For (id, start, end_, step, body)
				1074	\| [< >] ->
				1075	raise (Stream.Error "expected 'in' after for")
				1076	end stream
				1077	\| [< >] ->
				1078	raise (Stream.Error "expected '=' after for")
				1079	end stream
				1080
				1081	(* varexpr
				1082	* ::= 'var' identifier ('=' expression?
				1083	* (',' identifier ('=' expression)?)* 'in' expression *)
				1084	\| [< 'Token.Var;
				1085	(* At least one variable name is required. *)
				1086	'Token.Ident id ?? "expected identifier after var";
				1087	init=parse_var_init;
				1088	var_names=parse_var_names [(id, init)];
				1089	(* At this point, we have to have 'in'. *)
				1090	'Token.In ?? "expected 'in' keyword after 'var'";
				1091	body=parse_expr >] ->
				1092	Ast.Var (Array.of_list (List.rev var_names), body)
				1093
				1094	\| [< >] -> raise (Stream.Error "unknown token when expecting an expression.")
				1095
				1096	(* unary
				1097	* ::= primary
				1098	* ::= '!' unary *)
				1099	and parse_unary = parser
				1100	(* If this is a unary operator, read it. *)
				1101	\| [< 'Token.Kwd op when op != '(' && op != ')'; operand=parse_expr >] ->
				1102	Ast.Unary (op, operand)
				1103
				1104	(* If the current token is not an operator, it must be a primary expr. *)
				1105	\| [< stream >] -> parse_primary stream
				1106
				1107	(* binoprhs
				1108	* ::= ('+' primary)* *)
				1109	and parse_bin_rhs expr_prec lhs stream =
				1110	match Stream.peek stream with
				1111	(* If this is a binop, find its precedence. *)
				1112	\| Some (Token.Kwd c) when Hashtbl.mem binop_precedence c ->
				1113	let token_prec = precedence c in
				1114
				1115	(* If this is a binop that binds at least as tightly as the current binop,
				1116	* consume it, otherwise we are done. *)
				1117	if token_prec < expr_prec then lhs else begin
				1118	(* Eat the binop. *)
				1119	Stream.junk stream;
				1120
				1121	(* Parse the primary expression after the binary operator. *)
				1122	let rhs = parse_unary stream in
				1123
				1124	(* Okay, we know this is a binop. *)
				1125	let rhs =
				1126	match Stream.peek stream with
				1127	\| Some (Token.Kwd c2) ->
				1128	(* If BinOp binds less tightly with rhs than the operator after
				1129	* rhs, let the pending operator take rhs as its lhs. *)
				1130	let next_prec = precedence c2 in
				1131	if token_prec < next_prec
				1132	then parse_bin_rhs (token_prec + 1) rhs stream
				1133	else rhs
				1134	\| _ -> rhs
				1135	in
				1136
				1137	(* Merge lhs/rhs. *)
				1138	let lhs = Ast.Binary (c, lhs, rhs) in
				1139	parse_bin_rhs expr_prec lhs stream
				1140	end
				1141	\| _ -> lhs
				1142
				1143	and parse_var_init = parser
				1144	(* read in the optional initializer. *)
				1145	\| [< 'Token.Kwd '='; e=parse_expr >] -> Some e
				1146	\| [< >] -> None
				1147
				1148	and parse_var_names accumulator = parser
				1149	\| [< 'Token.Kwd ',';
				1150	'Token.Ident id ?? "expected identifier list after var";
				1151	init=parse_var_init;
				1152	e=parse_var_names ((id, init) :: accumulator) >] -> e
				1153	\| [< >] -> accumulator
				1154
				1155	(* expression
				1156	* ::= primary binoprhs *)
				1157	and parse_expr = parser
				1158	\| [< lhs=parse_unary; stream >] -> parse_bin_rhs 0 lhs stream
				1159
				1160	(* prototype
				1161	* ::= id '(' id* ')'
				1162	* ::= binary LETTER number? (id, id)
				1163	* ::= unary LETTER number? (id) *)
				1164	let parse_prototype =
				1165	let rec parse_args accumulator = parser
				1166	\| [< 'Token.Ident id; e=parse_args (id::accumulator) >] -> e
				1167	\| [< >] -> accumulator
				1168	in
				1169	let parse_operator = parser
				1170	\| [< 'Token.Unary >] -> "unary", 1
				1171	\| [< 'Token.Binary >] -> "binary", 2
				1172	in
				1173	let parse_binary_precedence = parser
				1174	\| [< 'Token.Number n >] -> int_of_float n
				1175	\| [< >] -> 30
				1176	in
				1177	parser
				1178	\| [< 'Token.Ident id;
				1179	'Token.Kwd '(' ?? "expected '(' in prototype";
				1180	args=parse_args [];
				1181	'Token.Kwd ')' ?? "expected ')' in prototype" >] ->
				1182	(* success. *)
				1183	Ast.Prototype (id, Array.of_list (List.rev args))
				1184	\| [< (prefix, kind)=parse_operator;
				1185	'Token.Kwd op ?? "expected an operator";
				1186	(* Read the precedence if present. *)
				1187	binary_precedence=parse_binary_precedence;
				1188	'Token.Kwd '(' ?? "expected '(' in prototype";
				1189	args=parse_args [];
				1190	'Token.Kwd ')' ?? "expected ')' in prototype" >] ->
				1191	let name = prefix ^ (String.make 1 op) in
				1192	let args = Array.of_list (List.rev args) in
				1193
				1194	(* Verify right number of arguments for operator. *)
				1195	if Array.length args != kind
				1196	then raise (Stream.Error "invalid number of operands for operator")
				1197	else
				1198	if kind == 1 then
				1199	Ast.Prototype (name, args)
				1200	else
				1201	Ast.BinOpPrototype (name, args, binary_precedence)
				1202	\| [< >] ->
				1203	raise (Stream.Error "expected function name in prototype")
				1204
				1205	(* definition ::= 'def' prototype expression *)
				1206	let parse_definition = parser
				1207	\| [< 'Token.Def; p=parse_prototype; e=parse_expr >] ->
				1208	Ast.Function (p, e)
				1209
				1210	(* toplevelexpr ::= expression *)
				1211	let parse_toplevel = parser
				1212	\| [< e=parse_expr >] ->
				1213	(* Make an anonymous proto. *)
				1214	Ast.Function (Ast.Prototype ("", [\|\|]), e)
				1215
				1216	(* external ::= 'extern' prototype *)
				1217	let parse_extern = parser
				1218	\| [< 'Token.Extern; e=parse_prototype >] -> e
				1219
				1220	codegen.ml:
				1221	.. code-block:: ocaml
				1222
				1223	(*===----------------------------------------------------------------------===
				1224	* Code Generation
				1225	===----------------------------------------------------------------------===)
				1226
				1227	open Llvm
				1228
				1229	exception Error of string
				1230
				1231	let context = global_context ()
				1232	let the_module = create_module context "my cool jit"
				1233	let builder = builder context
				1234	let named_values:(string, llvalue) Hashtbl.t = Hashtbl.create 10
				1235	let double_type = double_type context
				1236
				1237	(* Create an alloca instruction in the entry block of the function. This
				1238	* is used for mutable variables etc. *)
				1239	let create_entry_block_alloca the_function var_name =
				1240	let builder = builder_at context (instr_begin (entry_block the_function)) in
				1241	build_alloca double_type var_name builder
				1242
				1243	let rec codegen_expr = function
				1244	\| Ast.Number n -> const_float double_type n
				1245	\| Ast.Variable name ->
				1246	let v = try Hashtbl.find named_values name with
				1247	\| Not_found -> raise (Error "unknown variable name")
				1248	in
				1249	(* Load the value. *)
				1250	build_load v name builder
				1251	\| Ast.Unary (op, operand) ->
				1252	let operand = codegen_expr operand in
				1253	let callee = "unary" ^ (String.make 1 op) in
				1254	let callee =
				1255	match lookup_function callee the_module with
				1256	\| Some callee -> callee
				1257	\| None -> raise (Error "unknown unary operator")
				1258	in
				1259	build_call callee [\|operand\|] "unop" builder
				1260	\| Ast.Binary (op, lhs, rhs) ->
				1261	begin match op with
				1262	\| '=' ->
				1263	(* Special case '=' because we don't want to emit the LHS as an
				1264	* expression. *)
				1265	let name =
				1266	match lhs with
				1267	\| Ast.Variable name -> name
				1268	\| _ -> raise (Error "destination of '=' must be a variable")
				1269	in
				1270
				1271	(* Codegen the rhs. *)
				1272	let val_ = codegen_expr rhs in
				1273
				1274	(* Lookup the name. *)
				1275	let variable = try Hashtbl.find named_values name with
				1276	\| Not_found -> raise (Error "unknown variable name")
				1277	in
				1278	ignore(build_store val_ variable builder);
				1279	val_
				1280	\| _ ->
				1281	let lhs_val = codegen_expr lhs in
				1282	let rhs_val = codegen_expr rhs in
				1283	begin
				1284	match op with
				1285	\| '+' -> build_add lhs_val rhs_val "addtmp" builder
				1286	\| '-' -> build_sub lhs_val rhs_val "subtmp" builder
				1287	\| '*' -> build_mul lhs_val rhs_val "multmp" builder
				1288	\| '<' ->
				1289	(* Convert bool 0/1 to double 0.0 or 1.0 *)
				1290	let i = build_fcmp Fcmp.Ult lhs_val rhs_val "cmptmp" builder in
				1291	build_uitofp i double_type "booltmp" builder
				1292	\| _ ->
				1293	(* If it wasn't a builtin binary operator, it must be a user defined
				1294	* one. Emit a call to it. *)
				1295	let callee = "binary" ^ (String.make 1 op) in
				1296	let callee =
				1297	match lookup_function callee the_module with
				1298	\| Some callee -> callee
				1299	\| None -> raise (Error "binary operator not found!")
				1300	in
				1301	build_call callee [\|lhs_val; rhs_val\|] "binop" builder
				1302	end
				1303	end
				1304	\| Ast.Call (callee, args) ->
				1305	(* Look up the name in the module table. *)
				1306	let callee =
				1307	match lookup_function callee the_module with
				1308	\| Some callee -> callee
				1309	\| None -> raise (Error "unknown function referenced")
				1310	in
				1311	let params = params callee in
				1312
				1313	(* If argument mismatch error. *)
				1314	if Array.length params == Array.length args then () else
				1315	raise (Error "incorrect # arguments passed");
				1316	let args = Array.map codegen_expr args in
				1317	build_call callee args "calltmp" builder
				1318	\| Ast.If (cond, then_, else_) ->
				1319	let cond = codegen_expr cond in
				1320
				1321	(* Convert condition to a bool by comparing equal to 0.0 *)
				1322	let zero = const_float double_type 0.0 in
				1323	let cond_val = build_fcmp Fcmp.One cond zero "ifcond" builder in
				1324
				1325	(* Grab the first block so that we might later add the conditional branch
				1326	* to it at the end of the function. *)
				1327	let start_bb = insertion_block builder in
				1328	let the_function = block_parent start_bb in
				1329
				1330	let then_bb = append_block context "then" the_function in
				1331
				1332	(* Emit 'then' value. *)
				1333	position_at_end then_bb builder;
				1334	let then_val = codegen_expr then_ in
				1335
				1336	(* Codegen of 'then' can change the current block, update then_bb for the
				1337	* phi. We create a new name because one is used for the phi node, and the
				1338	* other is used for the conditional branch. *)
				1339	let new_then_bb = insertion_block builder in
				1340
				1341	(* Emit 'else' value. *)
				1342	let else_bb = append_block context "else" the_function in
				1343	position_at_end else_bb builder;
				1344	let else_val = codegen_expr else_ in
				1345
				1346	(* Codegen of 'else' can change the current block, update else_bb for the
				1347	* phi. *)
				1348	let new_else_bb = insertion_block builder in
				1349
				1350	(* Emit merge block. *)
				1351	let merge_bb = append_block context "ifcont" the_function in
				1352	position_at_end merge_bb builder;
				1353	let incoming = [(then_val, new_then_bb); (else_val, new_else_bb)] in
				1354	let phi = build_phi incoming "iftmp" builder in
				1355
				1356	(* Return to the start block to add the conditional branch. *)
				1357	position_at_end start_bb builder;
				1358	ignore (build_cond_br cond_val then_bb else_bb builder);
				1359
				1360	(* Set a unconditional branch at the end of the 'then' block and the
				1361	* 'else' block to the 'merge' block. *)
				1362	position_at_end new_then_bb builder; ignore (build_br merge_bb builder);
				1363	position_at_end new_else_bb builder; ignore (build_br merge_bb builder);
				1364
				1365	(* Finally, set the builder to the end of the merge block. *)
				1366	position_at_end merge_bb builder;
				1367
				1368	phi
				1369	\| Ast.For (var_name, start, end_, step, body) ->
				1370	(* Output this as:
				1371	* var = alloca double
				1372	* ...
				1373	* start = startexpr
				1374	* store start -> var
				1375	* goto loop
				1376	* loop:
				1377	* ...
				1378	* bodyexpr
				1379	* ...
				1380	* loopend:
				1381	* step = stepexpr
				1382	* endcond = endexpr
				1383	*
				1384	* curvar = load var
				1385	* nextvar = curvar + step
				1386	* store nextvar -> var
				1387	* br endcond, loop, endloop
				1388	* outloop: *)
				1389
				1390	let the_function = block_parent (insertion_block builder) in
				1391
				1392	(* Create an alloca for the variable in the entry block. *)
				1393	let alloca = create_entry_block_alloca the_function var_name in
				1394
				1395	(* Emit the start code first, without 'variable' in scope. *)
				1396	let start_val = codegen_expr start in
				1397
				1398	(* Store the value into the alloca. *)
				1399	ignore(build_store start_val alloca builder);
				1400
				1401	(* Make the new basic block for the loop header, inserting after current
				1402	* block. *)
				1403	let loop_bb = append_block context "loop" the_function in
				1404
				1405	(* Insert an explicit fall through from the current block to the
				1406	* loop_bb. *)
				1407	ignore (build_br loop_bb builder);
				1408
				1409	(* Start insertion in loop_bb. *)
				1410	position_at_end loop_bb builder;
				1411
				1412	(* Within the loop, the variable is defined equal to the PHI node. If it
				1413	* shadows an existing variable, we have to restore it, so save it
				1414	* now. *)
				1415	let old_val =
				1416	try Some (Hashtbl.find named_values var_name) with Not_found -> None
				1417	in
				1418	Hashtbl.add named_values var_name alloca;
				1419
				1420	(* Emit the body of the loop. This, like any other expr, can change the
				1421	* current BB. Note that we ignore the value computed by the body, but
				1422	* don't allow an error *)
				1423	ignore (codegen_expr body);
				1424
				1425	(* Emit the step value. *)
				1426	let step_val =
				1427	match step with
				1428	\| Some step -> codegen_expr step
				1429	(* If not specified, use 1.0. *)
				1430	\| None -> const_float double_type 1.0
				1431	in
				1432
				1433	(* Compute the end condition. *)
				1434	let end_cond = codegen_expr end_ in
				1435
				1436	(* Reload, increment, and restore the alloca. This handles the case where
				1437	* the body of the loop mutates the variable. *)
				1438	let cur_var = build_load alloca var_name builder in
				1439	let next_var = build_add cur_var step_val "nextvar" builder in
				1440	ignore(build_store next_var alloca builder);
				1441
				1442	(* Convert condition to a bool by comparing equal to 0.0. *)
				1443	let zero = const_float double_type 0.0 in
				1444	let end_cond = build_fcmp Fcmp.One end_cond zero "loopcond" builder in
				1445
				1446	(* Create the "after loop" block and insert it. *)
				1447	let after_bb = append_block context "afterloop" the_function in
				1448
				1449	(* Insert the conditional branch into the end of loop_end_bb. *)
				1450	ignore (build_cond_br end_cond loop_bb after_bb builder);
				1451
				1452	(* Any new code will be inserted in after_bb. *)
				1453	position_at_end after_bb builder;
				1454
				1455	(* Restore the unshadowed variable. *)
				1456	begin match old_val with
				1457	\| Some old_val -> Hashtbl.add named_values var_name old_val
				1458	\| None -> ()
				1459	end;
				1460
				1461	(* for expr always returns 0.0. *)
				1462	const_null double_type
				1463	\| Ast.Var (var_names, body) ->
				1464	let old_bindings = ref [] in
				1465
				1466	let the_function = block_parent (insertion_block builder) in
				1467
				1468	(* Register all variables and emit their initializer. *)
				1469	Array.iter (fun (var_name, init) ->
				1470	(* Emit the initializer before adding the variable to scope, this
				1471	* prevents the initializer from referencing the variable itself, and
				1472	* permits stuff like this:
				1473	* var a = 1 in
				1474	* var a = a in ... # refers to outer 'a'. *)
				1475	let init_val =
				1476	match init with
				1477	\| Some init -> codegen_expr init
				1478	(* If not specified, use 0.0. *)
				1479	\| None -> const_float double_type 0.0
				1480	in
				1481
				1482	let alloca = create_entry_block_alloca the_function var_name in
				1483	ignore(build_store init_val alloca builder);
				1484
				1485	(* Remember the old variable binding so that we can restore the binding
				1486	* when we unrecurse. *)
				1487	begin
				1488	try
				1489	let old_value = Hashtbl.find named_values var_name in
				1490	old_bindings := (var_name, old_value) :: !old_bindings;
				1491	with Not_found -> ()
				1492	end;
				1493
				1494	(* Remember this binding. *)
				1495	Hashtbl.add named_values var_name alloca;
				1496	) var_names;
				1497
				1498	(* Codegen the body, now that all vars are in scope. *)
				1499	let body_val = codegen_expr body in
				1500
				1501	(* Pop all our variables from scope. *)
				1502	List.iter (fun (var_name, old_value) ->
				1503	Hashtbl.add named_values var_name old_value
				1504	) !old_bindings;
				1505
				1506	(* Return the body computation. *)
				1507	body_val
				1508
				1509	let codegen_proto = function
				1510	\| Ast.Prototype (name, args) \| Ast.BinOpPrototype (name, args, _) ->
				1511	(* Make the function type: double(double,double) etc. *)
				1512	let doubles = Array.make (Array.length args) double_type in
				1513	let ft = function_type double_type doubles in
				1514	let f =
				1515	match lookup_function name the_module with
				1516	\| None -> declare_function name ft the_module
				1517
				1518	(* If 'f' conflicted, there was already something named 'name'. If it
				1519	* has a body, don't allow redefinition or reextern. *)
				1520	\| Some f ->
				1521	(* If 'f' already has a body, reject this. *)
				1522	if block_begin f <> At_end f then
				1523	raise (Error "redefinition of function");
				1524
				1525	(* If 'f' took a different number of arguments, reject. *)
				1526	if element_type (type_of f) <> ft then
				1527	raise (Error "redefinition of function with different # args");
				1528	f
				1529	in
				1530
				1531	(* Set names for all arguments. *)
				1532	Array.iteri (fun i a ->
				1533	let n = args.(i) in
				1534	set_value_name n a;
				1535	Hashtbl.add named_values n a;
				1536	) (params f);
				1537	f
				1538
				1539	(* Create an alloca for each argument and register the argument in the symbol
				1540	* table so that references to it will succeed. *)
				1541	let create_argument_allocas the_function proto =
				1542	let args = match proto with
				1543	\| Ast.Prototype (_, args) \| Ast.BinOpPrototype (_, args, _) -> args
				1544	in
				1545	Array.iteri (fun i ai ->
				1546	let var_name = args.(i) in
				1547	(* Create an alloca for this variable. *)
				1548	let alloca = create_entry_block_alloca the_function var_name in
				1549
				1550	(* Store the initial value into the alloca. *)
				1551	ignore(build_store ai alloca builder);
				1552
				1553	(* Add arguments to variable symbol table. *)
				1554	Hashtbl.add named_values var_name alloca;
				1555	) (params the_function)
				1556
				1557	let codegen_func the_fpm = function
				1558	\| Ast.Function (proto, body) ->
				1559	Hashtbl.clear named_values;
				1560	let the_function = codegen_proto proto in
				1561
				1562	(* If this is an operator, install it. *)
				1563	begin match proto with
				1564	\| Ast.BinOpPrototype (name, args, prec) ->
				1565	let op = name.[String.length name - 1] in
				1566	Hashtbl.add Parser.binop_precedence op prec;
				1567	\| _ -> ()
				1568	end;
				1569
				1570	(* Create a new basic block to start insertion into. *)
				1571	let bb = append_block context "entry" the_function in
				1572	position_at_end bb builder;
				1573
				1574	try
				1575	(* Add all arguments to the symbol table and create their allocas. *)
				1576	create_argument_allocas the_function proto;
				1577
				1578	let ret_val = codegen_expr body in
				1579
				1580	(* Finish off the function. *)
				1581	let _ = build_ret ret_val builder in
				1582
				1583	(* Validate the generated code, checking for consistency. *)
				1584	Llvm_analysis.assert_valid_function the_function;
				1585
				1586	(* Optimize the function. *)
				1587	let _ = PassManager.run_function the_function the_fpm in
				1588
				1589	the_function
				1590	with e ->
				1591	delete_function the_function;
				1592	raise e
				1593
				1594	toplevel.ml:
				1595	.. code-block:: ocaml
				1596
				1597	(*===----------------------------------------------------------------------===
				1598	* Top-Level parsing and JIT Driver
				1599	===----------------------------------------------------------------------===)
				1600
				1601	open Llvm
				1602	open Llvm_executionengine
				1603
				1604	(* top ::= definition \| external \| expression \| ';' *)
				1605	let rec main_loop the_fpm the_execution_engine stream =
				1606	match Stream.peek stream with
				1607	\| None -> ()
				1608
				1609	(* ignore top-level semicolons. *)
				1610	\| Some (Token.Kwd ';') ->
				1611	Stream.junk stream;
				1612	main_loop the_fpm the_execution_engine stream
				1613
				1614	\| Some token ->
				1615	begin
				1616	try match token with
				1617	\| Token.Def ->
				1618	let e = Parser.parse_definition stream in
				1619	print_endline "parsed a function definition.";
				1620	dump_value (Codegen.codegen_func the_fpm e);
				1621	\| Token.Extern ->
				1622	let e = Parser.parse_extern stream in
				1623	print_endline "parsed an extern.";
				1624	dump_value (Codegen.codegen_proto e);
				1625	\| _ ->
				1626	(* Evaluate a top-level expression into an anonymous function. *)
				1627	let e = Parser.parse_toplevel stream in
				1628	print_endline "parsed a top-level expr";
				1629	let the_function = Codegen.codegen_func the_fpm e in
				1630	dump_value the_function;
				1631
				1632	(* JIT the function, returning a function pointer. *)
				1633	let result = ExecutionEngine.run_function the_function [\|\|]
				1634	the_execution_engine in
				1635
				1636	print_string "Evaluated to ";
				1637	print_float (GenericValue.as_float Codegen.double_type result);
				1638	print_newline ();
				1639	with Stream.Error s \| Codegen.Error s ->
				1640	(* Skip token for error recovery. *)
				1641	Stream.junk stream;
				1642	print_endline s;
				1643	end;
				1644	print_string "ready> "; flush stdout;
				1645	main_loop the_fpm the_execution_engine stream
				1646
				1647	toy.ml:
				1648	.. code-block:: ocaml
				1649
				1650	(*===----------------------------------------------------------------------===
				1651	* Main driver code.
				1652	===----------------------------------------------------------------------===)
				1653
				1654	open Llvm
				1655	open Llvm_executionengine
				1656	open Llvm_target
				1657	open Llvm_scalar_opts
				1658
				1659	let main () =
				1660	ignore (initialize_native_target ());
				1661
				1662	(* Install standard binary operators.
				1663	* 1 is the lowest precedence. *)
				1664	Hashtbl.add Parser.binop_precedence '=' 2;
				1665	Hashtbl.add Parser.binop_precedence '<' 10;
				1666	Hashtbl.add Parser.binop_precedence '+' 20;
				1667	Hashtbl.add Parser.binop_precedence '-' 20;
				1668	Hashtbl.add Parser.binop_precedence '' 40; ( highest. *)
				1669
				1670	(* Prime the first token. *)
				1671	print_string "ready> "; flush stdout;
				1672	let stream = Lexer.lex (Stream.of_channel stdin) in
				1673
				1674	(* Create the JIT. *)
				1675	let the_execution_engine = ExecutionEngine.create Codegen.the_module in
				1676	let the_fpm = PassManager.create_function Codegen.the_module in
				1677
				1678	(* Set up the optimizer pipeline. Start with registering info about how the
				1679	* target lays out data structures. *)
				1680	DataLayout.add (ExecutionEngine.target_data the_execution_engine) the_fpm;
				1681
				1682	(* Promote allocas to registers. *)
				1683	add_memory_to_register_promotion the_fpm;
				1684
				1685	(* Do simple "peephole" optimizations and bit-twiddling optzn. *)
				1686	add_instruction_combination the_fpm;
				1687
				1688	(* reassociate expressions. *)
				1689	add_reassociation the_fpm;
				1690
				1691	(* Eliminate Common SubExpressions. *)
				1692	add_gvn the_fpm;
				1693
				1694	(* Simplify the control flow graph (deleting unreachable blocks, etc). *)
				1695	add_cfg_simplification the_fpm;
				1696
				1697	ignore (PassManager.initialize the_fpm);
				1698
				1699	(* Run the main "interpreter loop" now. *)
				1700	Toplevel.main_loop the_fpm the_execution_engine stream;
				1701
				1702	(* Print out all the generated code. *)
				1703	dump_module Codegen.the_module
				1704	;;
				1705
				1706	main ()
				1707
				1708	bindings.c
				1709	.. code-block:: c
				1710
				1711	#include <stdio.h>
				1712
				1713	/* putchard - putchar that takes a double and returns 0. */
				1714	extern double putchard(double X) {
				1715	putchar((char)X);
				1716	return 0;
				1717	}
				1718
				1719	/* printd - printf that takes a double prints it as "%f\n", returning 0. */
				1720	extern double printd(double X) {
				1721	printf("%f\n", X);
				1722	return 0;
				1723	}
				1724
				1725	`Next: Conclusion and other useful LLVM tidbits <OCamlLangImpl8.html>`_
				1726