Blame - docs/tutorial/OCamlLangImpl7.rst - fp2-dev/platform/external/llvm

Sean Silva	ee47edf	2012-12-05 00:26:32 +0000	[diff] [blame]	1	=======================================================
				2	Kaleidoscope: Extending the Language: Mutable Variables
				3	=======================================================
				4
.. contents::
   :local:
				7
Chapter 7 Introduction
======================
				10
				11	Welcome to Chapter 7 of the "`Implementing a language with
				12	LLVM <index.html>`_" tutorial. In chapters 1 through 6, we've built a
				13	very respectable, albeit simple, `functional programming
				14	language <http://en.wikipedia.org/wiki/Functional_programming>`_. In our
				15	journey, we learned some parsing techniques, how to build and represent
				16	an AST, how to build LLVM IR, and how to optimize the resultant code as
				17	well as JIT compile it.
				18
				19	While Kaleidoscope is interesting as a functional language, the fact
				20	that it is functional makes it "too easy" to generate LLVM IR for it. In
				21	particular, a functional language makes it very easy to build LLVM IR
				22	directly in `SSA
				23	form <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_.
Since LLVM requires that the input code be in SSA form, this is a very
nice property. However, it is often unclear to newcomers how to generate
code for an imperative language with mutable variables.
				27
				28	The short (and happy) summary of this chapter is that there is no need
				29	for your front-end to build SSA form: LLVM provides highly tuned and
				30	well tested support for this, though the way it works is a bit
				31	unexpected for some.
				32
				33	Why is this a hard problem?
				34	===========================
				35
				36	To understand why mutable variables cause complexities in SSA
				37	construction, consider this extremely simple C example:
				38
.. code-block:: c

    int G, H;
    int test(_Bool Condition) {
      int X;
      if (Condition)
        X = G;
      else
        X = H;
      return X;
    }
				50
				51	In this case, we have the variable "X", whose value depends on the path
				52	executed in the program. Because there are two different possible values
				53	for X before the return instruction, a PHI node is inserted to merge the
				54	two values. The LLVM IR that we want for this example looks like this:
				55
.. code-block:: llvm

    @G = weak global i32 0   ; type of @G is i32*
    @H = weak global i32 0   ; type of @H is i32*

    define i32 @test(i1 %Condition) {
    entry:
      br i1 %Condition, label %cond_true, label %cond_false

    cond_true:
      %X.0 = load i32* @G
      br label %cond_next

    cond_false:
      %X.1 = load i32* @H
      br label %cond_next

    cond_next:
      %X.2 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
      ret i32 %X.2
    }
				77
				78	In this example, the loads from the G and H global variables are
				79	explicit in the LLVM IR, and they live in the then/else branches of the
				80	if statement (cond\_true/cond\_false). In order to merge the incoming
				81	values, the X.2 phi node in the cond\_next block selects the right value
				82	to use based on where control flow is coming from: if control flow comes
				83	from the cond\_false block, X.2 gets the value of X.1. Alternatively, if
				84	control flow comes from cond\_true, it gets the value of X.0. The intent
				85	of this chapter is not to explain the details of SSA form. For more
				86	information, see one of the many `online
				87	references <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_.
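
To make the selection rule concrete, here is a minimal sketch in plain
OCaml. It is only a model, not LLVM API: the ``block`` type and the ``phi``
and ``test`` functions are hypothetical names invented for this
illustration.

.. code-block:: ocaml

    (* Hypothetical model of the two predecessor blocks of cond_next. *)
    type block = Cond_true | Cond_false

    (* A phi node: a list of (incoming value, predecessor) pairs. The result
     * is the value paired with the block control flow actually came from. *)
    let phi incoming pred =
      fst (List.find (fun (_, b) -> b = pred) incoming)

    (* Mirrors: %X.2 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
     * For simplicity both "loads" are evaluated up front; in the real IR
     * only the load on the taken path executes. *)
    let test ~g ~h condition =
      let x0 = g and x1 = h in
      let pred = if condition then Cond_true else Cond_false in
      phi [ (x1, Cond_false); (x0, Cond_true) ] pred

Calling ``test ~g:1 ~h:2 true`` selects the value that came from
``cond_true``, and passing ``false`` selects the one from ``cond_false``.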
				88
				89	The question for this article is "who places the phi nodes when lowering
				90	assignments to mutable variables?". The issue here is that LLVM
				91	requires that its IR be in SSA form: there is no "non-ssa" mode for
				92	it. However, SSA construction requires non-trivial algorithms and data
				93	structures, so it is inconvenient and wasteful for every front-end to
				94	have to reproduce this logic.
				95
				96	Memory in LLVM
				97	==============
				98
				99	The 'trick' here is that while LLVM does require all register values to
				100	be in SSA form, it does not require (or permit) memory objects to be in
				101	SSA form. In the example above, note that the loads from G and H are
				102	direct accesses to G and H: they are not renamed or versioned. This
				103	differs from some other compiler systems, which do try to version memory
				104	objects. In LLVM, instead of encoding dataflow analysis of memory into
				105	the LLVM IR, it is handled with `Analysis
				106	Passes <../WritingAnLLVMPass.html>`_ which are computed on demand.
				107
				108	With this in mind, the high-level idea is that we want to make a stack
				109	variable (which lives in memory, because it is on the stack) for each
				110	mutable object in a function. To take advantage of this trick, we need
				111	to talk about how LLVM represents stack variables.
				112
				113	In LLVM, all memory accesses are explicit with load/store instructions,
				114	and it is carefully designed not to have (or need) an "address-of"
				115	operator. Notice how the type of the @G/@H global variables is actually
				116	"i32\*" even though the variable is defined as "i32". What this means is
				117	that @G defines space for an i32 in the global data area, but its
				118	name actually refers to the address for that space. Stack variables
				119	work the same way, except that instead of being declared with global
				120	variable definitions, they are declared with the `LLVM alloca
				121	instruction <../LangRef.html#i_alloca>`_:
				122
.. code-block:: llvm

    define i32 @example() {
    entry:
      %X = alloca i32           ; type of %X is i32*.
      ...
      %tmp = load i32* %X       ; load the stack value %X from the stack.
      %tmp2 = add i32 %tmp, 1   ; increment it
      store i32 %tmp2, i32* %X  ; store it back
      ...
				133
				134	This code shows an example of how you can declare and manipulate a stack
				135	variable in the LLVM IR. Stack memory allocated with the alloca
				136	instruction is fully general: you can pass the address of the stack slot
				137	to functions, you can store it in other variables, etc. In our example
				138	above, we could rewrite the example to use the alloca technique to avoid
				139	using a PHI node:
				140
.. code-block:: llvm

    @G = weak global i32 0   ; type of @G is i32*
    @H = weak global i32 0   ; type of @H is i32*

    define i32 @test(i1 %Condition) {
    entry:
      %X = alloca i32           ; type of %X is i32*.
      br i1 %Condition, label %cond_true, label %cond_false

    cond_true:
      %X.0 = load i32* @G
      store i32 %X.0, i32* %X   ; Update X
      br label %cond_next

    cond_false:
      %X.1 = load i32* @H
      store i32 %X.1, i32* %X   ; Update X
      br label %cond_next

    cond_next:
      %X.2 = load i32* %X       ; Read X
      ret i32 %X.2
    }
				165
				166	With this, we have discovered a way to handle arbitrary mutable
				167	variables without the need to create Phi nodes at all:
				168
				169	#. Each mutable variable becomes a stack allocation.
				170	#. Each read of the variable becomes a load from the stack.
				171	#. Each update of the variable becomes a store to the stack.
				172	#. Taking the address of a variable just uses the stack address
				173	directly.
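
If it helps to see these rules outside of LLVM IR, the same discipline can
be mimicked in plain OCaml, where a ``ref`` cell plays the role of the
stack slot. This is only an analogy under that assumption: the values 10
and 20 are made up, and nothing here calls into LLVM.

.. code-block:: ocaml

    (* Globals: like @G and @H, the names denote storage, not values. *)
    let g = ref 10
    let h = ref 20

    let test condition =
      (* %X = alloca i32; unlike alloca, an OCaml ref needs an initial value. *)
      let x = ref 0 in
      if condition then x := !g   (* store (load @G) into X *)
      else x := !h;               (* store (load @H) into X *)
      !x                          (* the single read of X is a load *)

Every update is a store, the one read is a load, and no value ever has to
be merged by hand at the join point.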
				174
While this solution has solved our immediate problem, it introduced
another one: we have now apparently introduced a lot of stack traffic
for very simple and common operations, a major performance problem.
Fortunately for us, the LLVM optimizer has a highly-tuned optimization
pass named "mem2reg" that handles this case, promoting allocas like this
into SSA registers, inserting Phi nodes as appropriate. If you run this
example through the pass, you'll get:
				182
.. code-block:: bash

    $ llvm-as < example.ll | opt -mem2reg | llvm-dis
    @G = weak global i32 0
    @H = weak global i32 0

    define i32 @test(i1 %Condition) {
    entry:
      br i1 %Condition, label %cond_true, label %cond_false

    cond_true:
      %X.0 = load i32* @G
      br label %cond_next

    cond_false:
      %X.1 = load i32* @H
      br label %cond_next

    cond_next:
      %X.01 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
      ret i32 %X.01
    }
				205
				206	The mem2reg pass implements the standard "iterated dominance frontier"
				207	algorithm for constructing SSA form and has a number of optimizations
				208	that speed up (very common) degenerate cases. The mem2reg optimization
				209	pass is the answer to dealing with mutable variables, and we highly
				210	recommend that you depend on it. Note that mem2reg only works on
				211	variables in certain circumstances:
				212
				213	#. mem2reg is alloca-driven: it looks for allocas and if it can handle
				214	them, it promotes them. It does not apply to global variables or heap
				215	allocations.
				216	#. mem2reg only looks for alloca instructions in the entry block of the
				217	function. Being in the entry block guarantees that the alloca is only
				218	executed once, which makes analysis simpler.
				219	#. mem2reg only promotes allocas whose uses are direct loads and stores.
				220	If the address of the stack object is passed to a function, or if any
				221	funny pointer arithmetic is involved, the alloca will not be
				222	promoted.
				223	#. mem2reg only works on allocas of `first
				224	class <../LangRef.html#t_classifications>`_ values (such as pointers,
				225	scalars and vectors), and only if the array size of the allocation is
				226	1 (or missing in the .ll file). mem2reg is not capable of promoting
				227	structs or arrays to registers. Note that the "scalarrepl" pass is
				228	more powerful and can promote structs, "unions", and arrays in many
				229	cases.
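
The four conditions above can be read as a checklist. The sketch below
encodes that checklist over a made-up description of an alloca; the types,
field names and the ``promotable`` function are all hypothetical, and this
is not how mem2reg itself is implemented.

.. code-block:: ocaml

    (* Hypothetical summary of how an alloca is used. *)
    type use = Load | Store | Call_arg | Pointer_arith
    type ty = Scalar | Pointer | Vector | Struct | Array

    type alloca_info = {
      in_entry_block : bool;  (* condition 2 *)
      allocated : ty;         (* condition 4: first class values only *)
      array_size : int;       (* condition 4: must be 1 *)
      uses : use list;        (* condition 3: direct loads and stores only *)
    }

    let promotable a =
      a.in_entry_block
      && (match a.allocated with
          | Scalar | Pointer | Vector -> true
          | Struct | Array -> false)
      && a.array_size = 1
      && List.for_all
           (function Load | Store -> true | Call_arg | Pointer_arith -> false)
           a.uses

    (* A plain local variable that is only stored to and loaded from. *)
    let simple =
      { in_entry_block = true; allocated = Scalar; array_size = 1;
        uses = [ Store; Load ] }

Here ``promotable simple`` holds, while passing the slot's address to a
call (``{ simple with uses = [ Store; Call_arg ] }``) makes it fail.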
				230
				231	All of these properties are easy to satisfy for most imperative
				232	languages, and we'll illustrate it below with Kaleidoscope. The final
				233	question you may be asking is: should I bother with this nonsense for my
				234	front-end? Wouldn't it be better if I just did SSA construction
				235	directly, avoiding use of the mem2reg optimization pass? In short, we
				236	strongly recommend that you use this technique for building SSA form,
				237	unless there is an extremely good reason not to. Using this technique
				238	is:
				239
				240	- Proven and well tested: llvm-gcc and clang both use this technique
				241	for local mutable variables. As such, the most common clients of LLVM
				242	are using this to handle a bulk of their variables. You can be sure
				243	that bugs are found fast and fixed early.
				244	- Extremely Fast: mem2reg has a number of special cases that make it
				245	fast in common cases as well as fully general. For example, it has
				246	fast-paths for variables that are only used in a single block,
				247	variables that only have one assignment point, good heuristics to
				248	avoid insertion of unneeded phi nodes, etc.
				249	- Needed for debug info generation: `Debug information in
				250	LLVM <../SourceLevelDebugging.html>`_ relies on having the address of
				251	the variable exposed so that debug info can be attached to it. This
				252	technique dovetails very naturally with this style of debug info.
				253
If nothing else, this makes it much easier to get your front-end up and
running, and is very simple to implement. Let's extend Kaleidoscope with
mutable variables now!
				257
				258	Mutable Variables in Kaleidoscope
				259	=================================
				260
Now that we know the sort of problem we want to tackle, let's see what
this looks like in the context of our little Kaleidoscope language.
We're going to add two features:
				264
				265	#. The ability to mutate variables with the '=' operator.
				266	#. The ability to define new variables.
				267
				268	While the first item is really what this is about, we only have
				269	variables for incoming arguments as well as for induction variables, and
				270	redefining those only goes so far :). Also, the ability to define new
				271	variables is a useful thing regardless of whether you will be mutating
				272	them. Here's a motivating example that shows how we could use these:
				273
::

    # Define ':' for sequencing: as a low-precedence operator that ignores operands
    # and just returns the RHS.
    def binary : 1 (x y) y;

    # Recursive fib, we could do this before.
    def fib(x)
      if (x < 3) then
        1
      else
        fib(x-1)+fib(x-2);

    # Iterative fib.
    def fibi(x)
      var a = 1, b = 1, c in
      (for i = 3, i < x in
         c = a + b :
         a = b :
         b = c) :
      b;

    # Call it.
    fibi(10);
				298
				299	In order to mutate variables, we have to change our existing variables
				300	to use the "alloca trick". Once we have that, we'll add our new
				301	operator, then extend Kaleidoscope to support new variable definitions.
				302
				303	Adjusting Existing Variables for Mutation
				304	=========================================
				305
The symbol table in Kaleidoscope is managed at code generation time by
the '``named_values``' map. This map currently keeps track of the LLVM
"Value\*" that holds the double value for the named variable. In order
to support mutation, we need to change this slightly, so that
``named_values`` holds the memory location of the variable in
question. Note that this change is a refactoring: it changes the
structure of the code, but does not (by itself) change the behavior of
the compiler. All of these changes are isolated in the Kaleidoscope code
generator.
				315
				316	At this point in Kaleidoscope's development, it only supports variables
				317	for two things: incoming arguments to functions and the induction
				318	variable of 'for' loops. For consistency, we'll allow mutation of these
				319	variables in addition to other user-defined variables. This means that
				320	these will both need memory locations.
				321
To start our transformation of Kaleidoscope, we'll change the
``named_values`` map so that it holds the alloca for each variable (an
``AllocaInst*`` in the C++ API) instead of its value (a ``Value*``).

Note: the OCaml bindings currently model both ``Value*``'s and
``AllocaInst*``'s as ``Llvm.llvalue``'s, but this may change in the future
to be more type safe. As a result, the declaration below is unchanged, and
the OCaml type checker will not point out the code we need to update; we
have to find those places ourselves:
				330
.. code-block:: ocaml

    let named_values:(string, llvalue) Hashtbl.t = Hashtbl.create 10
				334
				335	Also, since we will need to create these alloca's, we'll use a helper
				336	function that ensures that the allocas are created in the entry block of
				337	the function:
				338
.. code-block:: ocaml

    (* Create an alloca instruction in the entry block of the function. This
     * is used for mutable variables etc. *)
    let create_entry_block_alloca the_function var_name =
      (* builder_at takes the LLVM context first; 'context' is the code
       * generator's global context. *)
      let builder = builder_at context (instr_begin (entry_block the_function)) in
      build_alloca double_type var_name builder
				346
				347	This funny looking code creates an ``Llvm.llbuilder`` object that is
				348	pointing at the first instruction of the entry block. It then creates an
				349	alloca with the expected name and returns it. Because all values in
				350	Kaleidoscope are doubles, there is no need to pass in a type to use.
				351
				352	With this in place, the first functionality change we want to make is to
				353	variable references. In our new scheme, variables live on the stack, so
				354	code generating a reference to them actually needs to produce a load
				355	from the stack slot:
				356
.. code-block:: ocaml

    let rec codegen_expr = function
      ...
      | Ast.Variable name ->
          let v = try Hashtbl.find named_values name with
            | Not_found -> raise (Error "unknown variable name")
          in
          (* Load the value. *)
          build_load v name builder
				367
				368	As you can see, this is pretty straightforward. Now we need to update
				369	the things that define the variables to set up the alloca. We'll start
				370	with ``codegen_expr Ast.For ...`` (see the `full code listing <#code>`_
				371	for the unabridged code):
				372
.. code-block:: ocaml

      | Ast.For (var_name, start, end_, step, body) ->
          let the_function = block_parent (insertion_block builder) in

          (* Create an alloca for the variable in the entry block. *)
          let alloca = create_entry_block_alloca the_function var_name in

          (* Emit the start code first, without 'variable' in scope. *)
          let start_val = codegen_expr start in

          (* Store the value into the alloca. *)
          ignore(build_store start_val alloca builder);

          ...

          (* Within the loop, the variable refers to the alloca. If it
           * shadows an existing variable, we have to restore it, so save it
           * now. *)
          let old_val =
            try Some (Hashtbl.find named_values var_name) with Not_found -> None
          in
          Hashtbl.add named_values var_name alloca;

          ...

          (* Compute the end condition. *)
          let end_cond = codegen_expr end_ in

          (* Reload, increment, and store back into the alloca. This handles
           * the case where the body of the loop mutates the variable. All
           * values are doubles, so the increment is a floating point add. *)
          let cur_var = build_load alloca var_name builder in
          let next_var = build_fadd cur_var step_val "nextvar" builder in
          ignore(build_store next_var alloca builder);
          ...
				408
				409	This code is virtually identical to the code `before we allowed mutable
				410	variables <OCamlLangImpl5.html#forcodegen>`_. The big difference is that
				411	we no longer have to construct a PHI node, and we use load/store to
				412	access the variable as needed.
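
To see why the reload matters, here is a small model of the loop in plain
OCaml, with a ``ref`` standing in for the alloca. The ``for_loop`` function
and its labelled arguments are hypothetical, and the ordering (body, end
condition, then increment) simply follows the snippet above rather than
any LLVM API.

.. code-block:: ocaml

    (* Hypothetical model of the loop shape: [body] receives the variable's
     * slot, so it may read or overwrite the induction variable. *)
    let for_loop ~start ~end_cond ~step ~body =
      let var = ref start in              (* alloca + store of the start value *)
      let running = ref true in
      while !running do
        body var;                         (* the body may store to the slot *)
        let keep_going = end_cond !var in (* the end condition loads the slot *)
        var := !var +. step;              (* reload, increment, store back *)
        running := keep_going
      done;
      !var

    (* Count iterations for a body that leaves the variable alone ... *)
    let plain_iterations =
      let n = ref 0 in
      ignore (for_loop ~start:1.0 ~end_cond:(fun i -> i < 3.0) ~step:1.0
                ~body:(fun _ -> incr n));
      !n

    (* ... and for a body that also bumps the variable itself. *)
    let mutating_iterations =
      let n = ref 0 in
      ignore (for_loop ~start:1.0 ~end_cond:(fun i -> i < 3.0) ~step:1.0
                ~body:(fun v -> incr n; v := !v +. 1.0));
      !n

Because the increment starts from whatever is currently in the slot, a
body that changes the variable shortens the loop: the first run iterates
three times, the second only twice.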
				413
				414	To support mutable argument variables, we need to also make allocas for
				415	them. The code for this is also pretty simple:
				416
.. code-block:: ocaml

    (* Create an alloca for each argument and register the argument in the symbol
     * table so that references to it will succeed. *)
    let create_argument_allocas the_function proto =
      let args = match proto with
        | Ast.Prototype (_, args) | Ast.BinOpPrototype (_, args, _) -> args
      in
      Array.iteri (fun i ai ->
        let var_name = args.(i) in
        (* Create an alloca for this variable. *)
        let alloca = create_entry_block_alloca the_function var_name in

        (* Store the initial value into the alloca. *)
        ignore(build_store ai alloca builder);

        (* Add arguments to variable symbol table. *)
        Hashtbl.add named_values var_name alloca;
      ) (params the_function)
				436
				437	For each argument, we make an alloca, store the input value to the
				438	function into the alloca, and register the alloca as the memory location
				439	for the argument. This method gets invoked by ``Codegen.codegen_func``
				440	right after it sets up the entry block for the function.
				441
				442	The final missing piece is adding the mem2reg pass, which allows us to
				443	get good codegen once again:
				444
.. code-block:: ocaml

    let main () =
      ...
      let the_fpm = PassManager.create_function Codegen.the_module in

      (* Set up the optimizer pipeline.  Start with registering info about how the
       * target lays out data structures. *)
      DataLayout.add (ExecutionEngine.target_data the_execution_engine) the_fpm;

      (* Promote allocas to registers. *)
      add_memory_to_register_promotion the_fpm;

      (* Do simple "peephole" optimizations and bit-twiddling optzn. *)
      add_instruction_combining the_fpm;

      (* reassociate expressions. *)
      add_reassociation the_fpm;
				463
				464	It is interesting to see what the code looks like before and after the
				465	mem2reg optimization runs. For example, this is the before/after code
				466	for our recursive fib function. Before the optimization:
				467
.. code-block:: llvm

    define double @fib(double %x) {
    entry:
      %x1 = alloca double
      store double %x, double* %x1
      %x2 = load double* %x1
      %cmptmp = fcmp ult double %x2, 3.000000e+00
      %booltmp = uitofp i1 %cmptmp to double
      %ifcond = fcmp one double %booltmp, 0.000000e+00
      br i1 %ifcond, label %then, label %else

    then:       ; preds = %entry
      br label %ifcont

    else:       ; preds = %entry
      %x3 = load double* %x1
      %subtmp = fsub double %x3, 1.000000e+00
      %calltmp = call double @fib(double %subtmp)
      %x4 = load double* %x1
      %subtmp5 = fsub double %x4, 2.000000e+00
      %calltmp6 = call double @fib(double %subtmp5)
      %addtmp = fadd double %calltmp, %calltmp6
      br label %ifcont

    ifcont:     ; preds = %else, %then
      %iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ]
      ret double %iftmp
    }
				497
				498	Here there is only one variable (x, the input argument) but you can
				499	still see the extremely simple-minded code generation strategy we are
				500	using. In the entry block, an alloca is created, and the initial input
				501	value is stored into it. Each reference to the variable does a reload
				502	from the stack. Also, note that we didn't modify the if/then/else
				503	expression, so it still inserts a PHI node. While we could make an
				504	alloca for it, it is actually easier to create a PHI node for it, so we
				505	still just make the PHI.
				506
				507	Here is the code after the mem2reg pass runs:
				508
.. code-block:: llvm

    define double @fib(double %x) {
    entry:
      %cmptmp = fcmp ult double %x, 3.000000e+00
      %booltmp = uitofp i1 %cmptmp to double
      %ifcond = fcmp one double %booltmp, 0.000000e+00
      br i1 %ifcond, label %then, label %else

    then:
      br label %ifcont

    else:
      %subtmp = fsub double %x, 1.000000e+00
      %calltmp = call double @fib(double %subtmp)
      %subtmp5 = fsub double %x, 2.000000e+00
      %calltmp6 = call double @fib(double %subtmp5)
      %addtmp = fadd double %calltmp, %calltmp6
      br label %ifcont

    ifcont:     ; preds = %else, %then
      %iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ]
      ret double %iftmp
    }
				533
This is a trivial case for mem2reg, since there are no redefinitions of
the variable. The point of showing this is to calm your tension about
inserting such blatant inefficiencies :).
				537
				538	After the rest of the optimizers run, we get:
				539
.. code-block:: llvm

    define double @fib(double %x) {
    entry:
      %cmptmp = fcmp ult double %x, 3.000000e+00
      %booltmp = uitofp i1 %cmptmp to double
      %ifcond = fcmp ueq double %booltmp, 0.000000e+00
      br i1 %ifcond, label %else, label %ifcont

    else:
      %subtmp = fsub double %x, 1.000000e+00
      %calltmp = call double @fib(double %subtmp)
      %subtmp5 = fsub double %x, 2.000000e+00
      %calltmp6 = call double @fib(double %subtmp5)
      %addtmp = fadd double %calltmp, %calltmp6
      ret double %addtmp

    ifcont:
      ret double 1.000000e+00
    }
				560
				561	Here we see that the simplifycfg pass decided to clone the return
				562	instruction into the end of the 'else' block. This allowed it to
				563	eliminate some branches and the PHI node.
				564
				565	Now that all symbol table references are updated to use stack variables,
				566	we'll add the assignment operator.
				567
				568	New Assignment Operator
				569	=======================
				570
				571	With our current framework, adding a new assignment operator is really
				572	simple. We will parse it just like any other binary operator, but handle
				573	it internally (instead of allowing the user to define it). The first
				574	step is to set a precedence:
				575
.. code-block:: ocaml

    let main () =
      (* Install standard binary operators.
       * 1 is the lowest precedence. *)
      Hashtbl.add Parser.binop_precedence '=' 2;
      Hashtbl.add Parser.binop_precedence '<' 10;
      Hashtbl.add Parser.binop_precedence '+' 20;
      Hashtbl.add Parser.binop_precedence '-' 20;
      ...
				586
				587	Now that the parser knows the precedence of the binary operator, it
				588	takes care of all the parsing and AST generation. We just need to
				589	implement codegen for the assignment operator. This looks like:
				590
.. code-block:: ocaml

    let rec codegen_expr = function
      ...
      | Ast.Binary (op, lhs, rhs) ->
          begin match op with
          | '=' ->
              (* Special case '=' because we don't want to emit the LHS as an
               * expression. *)
              let name =
                match lhs with
                | Ast.Variable name -> name
                | _ -> raise (Error "destination of '=' must be a variable")
              in
				603
				604	Unlike the rest of the binary operators, our assignment operator doesn't
				605	follow the "emit LHS, emit RHS, do computation" model. As such, it is
				606	handled as a special case before the other binary operators are handled.
				607	The other strange thing is that it requires the LHS to be a variable. It
				608	is invalid to have "(x+1) = expr" - only things like "x = expr" are
				609	allowed.
				610
.. code-block:: ocaml

              (* Codegen the rhs. *)
              let val_ = codegen_expr rhs in

              (* Lookup the name. *)
              let variable = try Hashtbl.find named_values name with
              | Not_found -> raise (Error "unknown variable name")
              in
              ignore(build_store val_ variable builder);
              val_
          | _ ->
                ...
				624
				625	Once we have the variable, codegen'ing the assignment is
				626	straightforward: we emit the RHS of the assignment, create a store, and
				627	return the computed value. Returning a value allows for chained
				628	assignments like "X = (Y = Z)".
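
The effect of returning the stored value is easy to check with a toy
evaluator. The sketch below is a stand-in for the code generator, not part
of it: the three-constructor ``expr`` type, the ``env`` table of ``ref``
cells and the ``eval`` function are all hypothetical.

.. code-block:: ocaml

    type expr =
      | Number of float
      | Variable of string
      | Assign of string * expr   (* the LHS is a name, never an expression *)

    (* Each variable owns a mutable cell, playing the role of its alloca. *)
    let env : (string, float ref) Hashtbl.t = Hashtbl.create 10

    let rec eval = function
      | Number n -> n
      | Variable name -> !(Hashtbl.find env name)       (* load *)
      | Assign (name, rhs) ->
          let v = eval rhs in                           (* emit the RHS first *)
          (Hashtbl.find env name) := v;                 (* store *)
          v                                             (* result of the '=' *)

    let () =
      List.iter (fun n -> Hashtbl.replace env n (ref 0.0)) [ "x"; "y"; "z" ];
      (Hashtbl.find env "z") := 7.0

    (* X = (Y = Z) *)
    let result = eval (Assign ("x", Assign ("y", Variable "z")))

After evaluating the chained assignment, both ``x`` and ``y`` hold the
value of ``z``, and ``result`` is that same value.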
				629
				630	Now that we have an assignment operator, we can mutate loop variables
				631	and arguments. For example, we can now run code like this:
				632
::

    # Function to print a double.
    extern printd(x);

    # Define ':' for sequencing: as a low-precedence operator that ignores operands
    # and just returns the RHS.
    def binary : 1 (x y) y;

    def test(x)
      printd(x) :
      x = 4 :
      printd(x);

    test(123);
				648
When run, this example prints "123" and then "4", showing that we did
actually mutate the value! Okay, we have now officially implemented our
goal: getting this to work requires SSA construction in the general
case. However, to be really useful, we want the ability to define our
own local variables, so let's add this next!
				654
				655	User-defined Local Variables
				656	============================
				657
Adding var/in is just like any other extension we made to
Kaleidoscope: we extend the lexer, the parser, the AST and the code
generator. The first step for adding our new 'var/in' construct is to
extend the lexer. As before, this is pretty trivial; the code looks like
this:
				663
.. code-block:: ocaml

    type token =
      ...
      (* var definition *)
      | Var

    ...

    and lex_ident buffer = parser
          ...
          | "in" -> [< 'Token.In; stream >]
          | "binary" -> [< 'Token.Binary; stream >]
          | "unary" -> [< 'Token.Unary; stream >]
          | "var" -> [< 'Token.Var; stream >]
          ...
				680
				681	The next step is to define the AST node that we will construct. For
				682	var/in, it looks like this:
				683
.. code-block:: ocaml

    type expr =
      ...
      (* variant for var/in. *)
      | Var of (string * expr option) array * expr
      ...
				691
var/in allows a list of names to be defined all at once, and each name
can optionally have an initializer value. As such, we capture this
information in an array of ``(string * expr option)`` pairs. Also, var/in
has a body; this body is allowed to access the variables defined by the
var/in.
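
As a quick illustration, this is how a declaration such as
``var a = 1, b = 1, c in b`` could be written down as a value of this
type. The sketch repeats a cut-down ``expr`` type so that it stands alone;
the ``uninitialized`` helper is a hypothetical name used only here.

.. code-block:: ocaml

    (* Cut-down copy of the AST, just enough for this example. *)
    type expr =
      | Number of float
      | Variable of string
      | Var of (string * expr option) array * expr

    (* var a = 1, b = 1, c in b *)
    let example =
      Var ([| ("a", Some (Number 1.0));
              ("b", Some (Number 1.0));
              ("c", None) |],            (* no initializer given *)
           Variable "b")

    (* Names declared without an initializer. *)
    let uninitialized = function
      | Var (names, _) ->
          Array.to_list names
          |> List.filter (fun (_, init) -> init = None)
          |> List.map fst
      | _ -> []

The ``None`` entry for ``c`` is what later lets the code generator fall
back to a default value.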
				696
				697	With this in place, we can define the parser pieces. The first thing we
				698	do is add it as a primary expression:
				699
.. code-block:: ocaml

    (* primary
     *   ::= identifier
     *   ::= numberexpr
     *   ::= parenexpr
     *   ::= ifexpr
     *   ::= forexpr
     *   ::= varexpr *)
    let rec parse_primary = parser
      ...
      (* varexpr
       *   ::= 'var' identifier ('=' expression)?
       *             (',' identifier ('=' expression)?)* 'in' expression *)
      | [< 'Token.Var;
           (* At least one variable name is required. *)
           'Token.Ident id ?? "expected identifier after var";
           init=parse_var_init;
           var_names=parse_var_names [(id, init)];
           (* At this point, we have to have 'in'. *)
           'Token.In ?? "expected 'in' keyword after 'var'";
           body=parse_expr >] ->
          Ast.Var (Array.of_list (List.rev var_names), body)

      ...

    and parse_var_init = parser
      (* read in the optional initializer. *)
      | [< 'Token.Kwd '='; e=parse_expr >] -> Some e
      | [< >] -> None

    and parse_var_names accumulator = parser
      | [< 'Token.Kwd ',';
           'Token.Ident id ?? "expected identifier list after var";
           init=parse_var_init;
           e=parse_var_names ((id, init) :: accumulator) >] -> e
      | [< >] -> accumulator
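
If the stream parser syntax is unfamiliar, the same grammar can be
sketched over a plain token list. This is a simplified stand-in, not the
tutorial's parser: the token and AST types are redeclared locally, the AST
constructor is renamed ``VarIn`` to avoid clashing with the ``Var`` token,
and ``parse_expr`` only accepts a single number or identifier.

.. code-block:: ocaml

    type token = Var | In | Ident of string | Number of float | Kwd of char

    type expr =
      | Num of float
      | Variable of string
      | VarIn of (string * expr option) array * expr

    exception Parse_error of string

    (* Simplified stand-in for parse_expr: one number or one identifier. *)
    let parse_expr = function
      | Number n :: rest -> (Num n, rest)
      | Ident id :: rest -> (Variable id, rest)
      | _ -> raise (Parse_error "expected expression")

    (* Read in the optional initializer. *)
    let parse_var_init = function
      | Kwd '=' :: rest ->
          let e, rest = parse_expr rest in
          (Some e, rest)
      | tokens -> (None, tokens)

    (* Collect the remaining ", name (= init)?" items, newest first. *)
    let rec parse_var_names accumulator = function
      | Kwd ',' :: Ident id :: rest ->
          let init, rest = parse_var_init rest in
          parse_var_names ((id, init) :: accumulator) rest
      | Kwd ',' :: _ -> raise (Parse_error "expected identifier list after var")
      | tokens -> (accumulator, tokens)

    let parse_var = function
      | Var :: Ident id :: rest ->
          let init, rest = parse_var_init rest in
          let names, rest = parse_var_names [ (id, init) ] rest in
          (match rest with
           | In :: rest ->
               let body, rest = parse_expr rest in
               (VarIn (Array.of_list (List.rev names), body), rest)
           | _ -> raise (Parse_error "expected 'in' keyword after 'var'"))
      | Var :: _ -> raise (Parse_error "expected identifier after var")
      | _ -> raise (Parse_error "expected 'var'")

Each function returns the parsed piece together with the tokens it did not
consume, which is the job the stream does implicitly in the real parser.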
				737
				738	Now that we can parse and represent the code, we need to support
				739	emission of LLVM IR for it. This code starts out with:
				740
.. code-block:: ocaml

    let rec codegen_expr = function
      ...
      | Ast.Var (var_names, body) ->
          let old_bindings = ref [] in

          let the_function = block_parent (insertion_block builder) in

          (* Register all variables and emit their initializer. *)
          Array.iter (fun (var_name, init) ->
				752
Basically it loops over all the variables, installing them one at a
time. For each variable we put into the symbol table, we remember the
previous value that we replace in ``old_bindings``.
				756
				757	.. code-block:: ocaml
				758
				759	(* Emit the initializer before adding the variable to scope, this
				760	* prevents the initializer from referencing the variable itself, and
				761	* permits stuff like this:
				762	* var a = 1 in
				763	* var a = a in ... # refers to outer 'a'. *)
				764	let init_val =
				765	match init with
				766	\| Some init -> codegen_expr init
				767	(* If not specified, use 0.0. *)
				768	\| None -> const_float double_type 0.0
				769	in
				770
				771	let alloca = create_entry_block_alloca the_function var_name in
				772	ignore(build_store init_val alloca builder);
				773
				774	(* Remember the old variable binding so that we can restore the binding
				775	* when we unrecurse. *)
				776
				777	begin
				778	try
				779	let old_value = Hashtbl.find named_values var_name in
				780	old_bindings := (var_name, old_value) :: !old_bindings;
with Not_found -> ()
				782	end;
				783
				784	(* Remember this binding. *)
				785	Hashtbl.add named_values var_name alloca;
				786	) var_names;
				787
				788	There are more comments here than code. The basic idea is that we emit
				789	the initializer, create the alloca, then update the symbol table to
				790	point to it. Once all the variables are installed in the symbol table,
				791	we evaluate the body of the var/in expression:
				792
				793	.. code-block:: ocaml
				794
				795	(* Codegen the body, now that all vars are in scope. *)
				796	let body_val = codegen_expr body in
				797
				798	Finally, before returning, we restore the previous variable bindings:
				799
				800	.. code-block:: ocaml
				801
				802	(* Pop all our variables from scope. *)
				803	List.iter (fun (var_name, old_value) ->
				804	Hashtbl.add named_values var_name old_value
				805	) !old_bindings;
				806
				807	(* Return the body computation. *)
				808	body_val
				809
				810	The end result of all of this is that we get properly scoped variable
				811	definitions, and we even (trivially) allow mutation of them :).
				812
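The scoping rule can be checked in isolation with a tiny interpreter. This is a sketch and not the tutorial's code generator: the AST is a hypothetical three-case subset, values are plain floats instead of allocas, and the scope is popped with ``Hashtbl.remove`` (which re-exposes a hidden binding) rather than by re-adding saved bindings.

```ocaml
(* Hypothetical mini AST: just enough to show var/in scoping. *)
type expr =
  | Number of float
  | Variable of string
  | Var of (string * expr option) array * expr

let named_values : (string, float) Hashtbl.t = Hashtbl.create 10

let rec eval = function
  | Number n -> n
  | Variable name ->
      (try Hashtbl.find named_values name
       with Not_found -> failwith "unknown variable name")
  | Var (var_names, body) ->
      Array.iter (fun (var_name, init) ->
        (* Evaluate the initializer before the new name is in scope. *)
        let init_val =
          match init with
          | Some init -> eval init
          | None -> 0.0
        in
        (* Hashtbl.add hides any outer binding of the same name. *)
        Hashtbl.add named_values var_name init_val
      ) var_names;

      (* Evaluate the body with all variables in scope. *)
      let body_val = eval body in

      (* Pop our bindings; Hashtbl.remove re-exposes whatever was hidden. *)
      Array.iter (fun (var_name, _) ->
        Hashtbl.remove named_values var_name
      ) var_names;
      body_val

let () =
  (* var a = 1 in var a = a in a *)
  let inner = Var ([| ("a", Some (Variable "a")) |], Variable "a") in
  let program = Var ([| ("a", Some (Number 1.0)) |], inner) in
  assert (eval program = 1.0);
  (* Nothing leaks out of the outermost scope. *)
  assert (not (Hashtbl.mem named_values "a"))
```

The inner initializer ``a`` resolves to the outer ``a`` because it is evaluated before the inner binding is installed, which is the ordering the comment in the code generator describes.
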
With this, we have completed what we set out to do. Our nice iterative fib
				814	example from the intro compiles and runs just fine. The mem2reg pass
				815	optimizes all of our stack variables into SSA registers, inserting PHI
				816	nodes where needed, and our front-end remains simple: no "iterated
				817	dominance frontier" computation anywhere in sight.
				818
				819	Full Code Listing
				820	=================
				821
				822	Here is the complete code listing for our running example, enhanced with
				823	mutable variables and var/in support. To build this example, use:
				824
				825	.. code-block:: bash
				826
				827	# Compile
				828	ocamlbuild toy.byte
				829	# Run
				830	./toy.byte
				831
				832	Here is the code:
				833
				834	\_tags:
				835	::
				836
				837	<{lexer,parser}.ml>: use_camlp4, pp(camlp4of)
				838	<*.{byte,native}>: g++, use_llvm, use_llvm_analysis
				839	<*.{byte,native}>: use_llvm_executionengine, use_llvm_target
				840	<*.{byte,native}>: use_llvm_scalar_opts, use_bindings
				841
				842	myocamlbuild.ml:
				843	.. code-block:: ocaml
				844
				845	open Ocamlbuild_plugin;;
				846
				847	ocaml_lib ~extern:true "llvm";;
				848	ocaml_lib ~extern:true "llvm_analysis";;
				849	ocaml_lib ~extern:true "llvm_executionengine";;
				850	ocaml_lib ~extern:true "llvm_target";;
				851	ocaml_lib ~extern:true "llvm_scalar_opts";;
				852
				853	flag ["link"; "ocaml"; "g++"] (S[A"-cc"; A"g++"; A"-cclib"; A"-rdynamic"]);;
				854	dep ["link"; "ocaml"; "use_bindings"] ["bindings.o"];;
				855
				856	token.ml:
				857	.. code-block:: ocaml
				858
				859	(*===----------------------------------------------------------------------===
				860	* Lexer Tokens
				861	===----------------------------------------------------------------------===)
				862
(* The lexer returns 'Kwd' for an unknown character, otherwise one of
* the other tokens for known things. *)
				865	type token =
				866	(* commands *)
				867	\| Def \| Extern
				868
				869	(* primary *)
				870	\| Ident of string \| Number of float
				871
				872	(* unknown *)
				873	\| Kwd of char
				874
				875	(* control *)
				876	\| If \| Then \| Else
				877	\| For \| In
				878
				879	(* operators *)
				880	\| Binary \| Unary
				881
				882	(* var definition *)
				883	\| Var
				884
				885	lexer.ml:
				886	.. code-block:: ocaml
				887
				888	(*===----------------------------------------------------------------------===
				889	* Lexer
				890	===----------------------------------------------------------------------===)
				891
				892	let rec lex = parser
				893	(* Skip any whitespace. *)
				894	\| [< ' (' ' \| '\n' \| '\r' \| '\t'); stream >] -> lex stream
				895
(* identifier: [a-zA-Z][a-zA-Z0-9]* *)
				897	\| [< ' ('A' .. 'Z' \| 'a' .. 'z' as c); stream >] ->
				898	let buffer = Buffer.create 1 in
				899	Buffer.add_char buffer c;
				900	lex_ident buffer stream
				901
				902	(* number: [0-9.]+ *)
				903	\| [< ' ('0' .. '9' as c); stream >] ->
				904	let buffer = Buffer.create 1 in
				905	Buffer.add_char buffer c;
				906	lex_number buffer stream
				907
				908	(* Comment until end of line. *)
				909	\| [< ' ('#'); stream >] ->
				910	lex_comment stream
				911
				912	(* Otherwise, just return the character as its ascii value. *)
				913	\| [< 'c; stream >] ->
				914	[< 'Token.Kwd c; lex stream >]
				915
				916	(* end of stream. *)
				917	\| [< >] -> [< >]
				918
				919	and lex_number buffer = parser
				920	\| [< ' ('0' .. '9' \| '.' as c); stream >] ->
				921	Buffer.add_char buffer c;
				922	lex_number buffer stream
				923	\| [< stream=lex >] ->
				924	[< 'Token.Number (float_of_string (Buffer.contents buffer)); stream >]
				925
				926	and lex_ident buffer = parser
				927	\| [< ' ('A' .. 'Z' \| 'a' .. 'z' \| '0' .. '9' as c); stream >] ->
				928	Buffer.add_char buffer c;
				929	lex_ident buffer stream
				930	\| [< stream=lex >] ->
				931	match Buffer.contents buffer with
				932	\| "def" -> [< 'Token.Def; stream >]
				933	\| "extern" -> [< 'Token.Extern; stream >]
				934	\| "if" -> [< 'Token.If; stream >]
				935	\| "then" -> [< 'Token.Then; stream >]
				936	\| "else" -> [< 'Token.Else; stream >]
				937	\| "for" -> [< 'Token.For; stream >]
				938	\| "in" -> [< 'Token.In; stream >]
				939	\| "binary" -> [< 'Token.Binary; stream >]
				940	\| "unary" -> [< 'Token.Unary; stream >]
				941	\| "var" -> [< 'Token.Var; stream >]
				942	\| id -> [< 'Token.Ident id; stream >]
				943
				944	and lex_comment = parser
				945	\| [< ' ('\n'); stream=lex >] -> stream
				946	\| [< 'c; e=lex_comment >] -> e
				947	\| [< >] -> [< >]
				948
				949	ast.ml:
				950	.. code-block:: ocaml
				951
				952	(*===----------------------------------------------------------------------===
				953	* Abstract Syntax Tree (aka Parse Tree)
				954	===----------------------------------------------------------------------===)
				955
				956	(* expr - Base type for all expression nodes. *)
				957	type expr =
				958	(* variant for numeric literals like "1.0". *)
				959	\| Number of float
				960
				961	(* variant for referencing a variable, like "a". *)
				962	\| Variable of string
				963
				964	(* variant for a unary operator. *)
				965	\| Unary of char * expr
				966
				967	(* variant for a binary operator. *)
				968	\| Binary of char * expr * expr
				969
				970	(* variant for function calls. *)
				971	\| Call of string * expr array
				972
				973	(* variant for if/then/else. *)
				974	\| If of expr * expr * expr
				975
				976	(* variant for for/in. *)
				977	\| For of string * expr * expr * expr option * expr
				978
				979	(* variant for var/in. *)
				980	\| Var of (string * expr option) array * expr
				981
				982	(* proto - This type represents the "prototype" for a function, which captures
				983	* its name, and its argument names (thus implicitly the number of arguments the
				984	* function takes). *)
				985	type proto =
				986	\| Prototype of string * string array
				987	\| BinOpPrototype of string * string array * int
				988
				989	(* func - This type represents a function definition itself. *)
				990	type func = Function of proto * expr
				991
				992	parser.ml:
				993	.. code-block:: ocaml
				994
				995	(*===---------------------------------------------------------------------===
				996	* Parser
				997	===---------------------------------------------------------------------===)
				998
				999	(* binop_precedence - This holds the precedence for each binary operator that is
				1000	* defined *)
				1001	let binop_precedence:(char, int) Hashtbl.t = Hashtbl.create 10
				1002
				1003	(* precedence - Get the precedence of the pending binary operator token. *)
				1004	let precedence c = try Hashtbl.find binop_precedence c with Not_found -> -1
				1005
				1006	(* primary
				1007	* ::= identifier
				1008	* ::= numberexpr
				1009	* ::= parenexpr
				1010	* ::= ifexpr
				1011	* ::= forexpr
				1012	* ::= varexpr *)
				1013	let rec parse_primary = parser
				1014	(* numberexpr ::= number *)
				1015	\| [< 'Token.Number n >] -> Ast.Number n
				1016
				1017	(* parenexpr ::= '(' expression ')' *)
				1018	\| [< 'Token.Kwd '('; e=parse_expr; 'Token.Kwd ')' ?? "expected ')'" >] -> e
				1019
				1020	(* identifierexpr
				1021	* ::= identifier
				1022	* ::= identifier '(' argumentexpr ')' *)
				1023	\| [< 'Token.Ident id; stream >] ->
				1024	let rec parse_args accumulator = parser
				1025	\| [< e=parse_expr; stream >] ->
				1026	begin parser
				1027	\| [< 'Token.Kwd ','; e=parse_args (e :: accumulator) >] -> e
				1028	\| [< >] -> e :: accumulator
				1029	end stream
				1030	\| [< >] -> accumulator
				1031	in
				1032	let rec parse_ident id = parser
				1033	(* Call. *)
				1034	\| [< 'Token.Kwd '(';
				1035	args=parse_args [];
				1036	'Token.Kwd ')' ?? "expected ')'">] ->
				1037	Ast.Call (id, Array.of_list (List.rev args))
				1038
				1039	(* Simple variable ref. *)
				1040	\| [< >] -> Ast.Variable id
				1041	in
				1042	parse_ident id stream
				1043
				1044	(* ifexpr ::= 'if' expr 'then' expr 'else' expr *)
				1045	\| [< 'Token.If; c=parse_expr;
				1046	'Token.Then ?? "expected 'then'"; t=parse_expr;
				1047	'Token.Else ?? "expected 'else'"; e=parse_expr >] ->
				1048	Ast.If (c, t, e)
				1049
				1050	(* forexpr
				1051	::= 'for' identifier '=' expr ',' expr (',' expr)? 'in' expression *)
				1052	\| [< 'Token.For;
				1053	'Token.Ident id ?? "expected identifier after for";
				1054	'Token.Kwd '=' ?? "expected '=' after for";
				1055	stream >] ->
				1056	begin parser
				1057	\| [<
				1058	start=parse_expr;
				1059	'Token.Kwd ',' ?? "expected ',' after for";
				1060	end_=parse_expr;
				1061	stream >] ->
				1062	let step =
				1063	begin parser
				1064	\| [< 'Token.Kwd ','; step=parse_expr >] -> Some step
				1065	\| [< >] -> None
				1066	end stream
				1067	in
				1068	begin parser
				1069	\| [< 'Token.In; body=parse_expr >] ->
				1070	Ast.For (id, start, end_, step, body)
				1071	\| [< >] ->
				1072	raise (Stream.Error "expected 'in' after for")
				1073	end stream
				1074	\| [< >] ->
				1075	raise (Stream.Error "expected '=' after for")
				1076	end stream
				1077
				1078	(* varexpr
* ::= 'var' identifier ('=' expression)?
				1080	* (',' identifier ('=' expression)?)* 'in' expression *)
				1081	\| [< 'Token.Var;
				1082	(* At least one variable name is required. *)
				1083	'Token.Ident id ?? "expected identifier after var";
				1084	init=parse_var_init;
				1085	var_names=parse_var_names [(id, init)];
				1086	(* At this point, we have to have 'in'. *)
				1087	'Token.In ?? "expected 'in' keyword after 'var'";
				1088	body=parse_expr >] ->
				1089	Ast.Var (Array.of_list (List.rev var_names), body)
				1090
				1091	\| [< >] -> raise (Stream.Error "unknown token when expecting an expression.")
				1092
				1093	(* unary
				1094	* ::= primary
				1095	* ::= '!' unary *)
				1096	and parse_unary = parser
				1097	(* If this is a unary operator, read it. *)
				1098	\| [< 'Token.Kwd op when op != '(' && op != ')'; operand=parse_expr >] ->
				1099	Ast.Unary (op, operand)
				1100
				1101	(* If the current token is not an operator, it must be a primary expr. *)
				1102	\| [< stream >] -> parse_primary stream
				1103
				1104	(* binoprhs
				1105	* ::= ('+' primary)* *)
				1106	and parse_bin_rhs expr_prec lhs stream =
				1107	match Stream.peek stream with
				1108	(* If this is a binop, find its precedence. *)
				1109	\| Some (Token.Kwd c) when Hashtbl.mem binop_precedence c ->
				1110	let token_prec = precedence c in
				1111
				1112	(* If this is a binop that binds at least as tightly as the current binop,
				1113	* consume it, otherwise we are done. *)
				1114	if token_prec < expr_prec then lhs else begin
				1115	(* Eat the binop. *)
				1116	Stream.junk stream;
				1117
				1118	(* Parse the primary expression after the binary operator. *)
				1119	let rhs = parse_unary stream in
				1120
				1121	(* Okay, we know this is a binop. *)
				1122	let rhs =
				1123	match Stream.peek stream with
				1124	\| Some (Token.Kwd c2) ->
				1125	(* If BinOp binds less tightly with rhs than the operator after
				1126	* rhs, let the pending operator take rhs as its lhs. *)
				1127	let next_prec = precedence c2 in
				1128	if token_prec < next_prec
				1129	then parse_bin_rhs (token_prec + 1) rhs stream
				1130	else rhs
				1131	\| _ -> rhs
				1132	in
				1133
				1134	(* Merge lhs/rhs. *)
				1135	let lhs = Ast.Binary (c, lhs, rhs) in
				1136	parse_bin_rhs expr_prec lhs stream
				1137	end
				1138	\| _ -> lhs
				1139
				1140	and parse_var_init = parser
				1141	(* read in the optional initializer. *)
				1142	\| [< 'Token.Kwd '='; e=parse_expr >] -> Some e
				1143	\| [< >] -> None
				1144
				1145	and parse_var_names accumulator = parser
				1146	\| [< 'Token.Kwd ',';
				1147	'Token.Ident id ?? "expected identifier list after var";
				1148	init=parse_var_init;
				1149	e=parse_var_names ((id, init) :: accumulator) >] -> e
				1150	\| [< >] -> accumulator
				1151
				1152	(* expression
				1153	* ::= primary binoprhs *)
				1154	and parse_expr = parser
				1155	\| [< lhs=parse_unary; stream >] -> parse_bin_rhs 0 lhs stream
				1156
				1157	(* prototype
				1158	* ::= id '(' id* ')'
				1159	* ::= binary LETTER number? (id, id)
				1160	* ::= unary LETTER number? (id) *)
				1161	let parse_prototype =
				1162	let rec parse_args accumulator = parser
				1163	\| [< 'Token.Ident id; e=parse_args (id::accumulator) >] -> e
				1164	\| [< >] -> accumulator
				1165	in
				1166	let parse_operator = parser
				1167	\| [< 'Token.Unary >] -> "unary", 1
				1168	\| [< 'Token.Binary >] -> "binary", 2
				1169	in
				1170	let parse_binary_precedence = parser
				1171	\| [< 'Token.Number n >] -> int_of_float n
				1172	\| [< >] -> 30
				1173	in
				1174	parser
				1175	\| [< 'Token.Ident id;
				1176	'Token.Kwd '(' ?? "expected '(' in prototype";
				1177	args=parse_args [];
				1178	'Token.Kwd ')' ?? "expected ')' in prototype" >] ->
				1179	(* success. *)
				1180	Ast.Prototype (id, Array.of_list (List.rev args))
				1181	\| [< (prefix, kind)=parse_operator;
				1182	'Token.Kwd op ?? "expected an operator";
				1183	(* Read the precedence if present. *)
				1184	binary_precedence=parse_binary_precedence;
				1185	'Token.Kwd '(' ?? "expected '(' in prototype";
				1186	args=parse_args [];
				1187	'Token.Kwd ')' ?? "expected ')' in prototype" >] ->
				1188	let name = prefix ^ (String.make 1 op) in
				1189	let args = Array.of_list (List.rev args) in
				1190
				1191	(* Verify right number of arguments for operator. *)
				1192	if Array.length args != kind
				1193	then raise (Stream.Error "invalid number of operands for operator")
				1194	else
				1195	if kind == 1 then
				1196	Ast.Prototype (name, args)
				1197	else
				1198	Ast.BinOpPrototype (name, args, binary_precedence)
				1199	\| [< >] ->
				1200	raise (Stream.Error "expected function name in prototype")
				1201
				1202	(* definition ::= 'def' prototype expression *)
				1203	let parse_definition = parser
				1204	\| [< 'Token.Def; p=parse_prototype; e=parse_expr >] ->
				1205	Ast.Function (p, e)
				1206
				1207	(* toplevelexpr ::= expression *)
				1208	let parse_toplevel = parser
				1209	\| [< e=parse_expr >] ->
				1210	(* Make an anonymous proto. *)
				1211	Ast.Function (Ast.Prototype ("", [\|\|]), e)
				1212
				1213	(* external ::= 'extern' prototype *)
				1214	let parse_extern = parser
				1215	\| [< 'Token.Extern; e=parse_prototype >] -> e
				1216
				1217	codegen.ml:
				1218	.. code-block:: ocaml
				1219
				1220	(*===----------------------------------------------------------------------===
				1221	* Code Generation
				1222	===----------------------------------------------------------------------===)
				1223
				1224	open Llvm
				1225
				1226	exception Error of string
				1227
				1228	let context = global_context ()
				1229	let the_module = create_module context "my cool jit"
				1230	let builder = builder context
				1231	let named_values:(string, llvalue) Hashtbl.t = Hashtbl.create 10
				1232	let double_type = double_type context
				1233
				1234	(* Create an alloca instruction in the entry block of the function. This
				1235	* is used for mutable variables etc. *)
				1236	let create_entry_block_alloca the_function var_name =
				1237	let builder = builder_at context (instr_begin (entry_block the_function)) in
				1238	build_alloca double_type var_name builder
				1239
				1240	let rec codegen_expr = function
				1241	\| Ast.Number n -> const_float double_type n
				1242	\| Ast.Variable name ->
				1243	let v = try Hashtbl.find named_values name with
				1244	\| Not_found -> raise (Error "unknown variable name")
				1245	in
				1246	(* Load the value. *)
				1247	build_load v name builder
				1248	\| Ast.Unary (op, operand) ->
				1249	let operand = codegen_expr operand in
				1250	let callee = "unary" ^ (String.make 1 op) in
				1251	let callee =
				1252	match lookup_function callee the_module with
				1253	\| Some callee -> callee
				1254	\| None -> raise (Error "unknown unary operator")
				1255	in
				1256	build_call callee [\|operand\|] "unop" builder
				1257	\| Ast.Binary (op, lhs, rhs) ->
				1258	begin match op with
				1259	\| '=' ->
				1260	(* Special case '=' because we don't want to emit the LHS as an
				1261	* expression. *)
				1262	let name =
				1263	match lhs with
				1264	\| Ast.Variable name -> name
				1265	\| _ -> raise (Error "destination of '=' must be a variable")
				1266	in
				1267
				1268	(* Codegen the rhs. *)
				1269	let val_ = codegen_expr rhs in
				1270
				1271	(* Lookup the name. *)
				1272	let variable = try Hashtbl.find named_values name with
				1273	\| Not_found -> raise (Error "unknown variable name")
				1274	in
				1275	ignore(build_store val_ variable builder);
				1276	val_
				1277	\| _ ->
				1278	let lhs_val = codegen_expr lhs in
				1279	let rhs_val = codegen_expr rhs in
				1280	begin
				1281	match op with
				1282	\| '+' -> build_add lhs_val rhs_val "addtmp" builder
				1283	\| '-' -> build_sub lhs_val rhs_val "subtmp" builder
				1284	\| '*' -> build_mul lhs_val rhs_val "multmp" builder
				1285	\| '<' ->
				1286	(* Convert bool 0/1 to double 0.0 or 1.0 *)
				1287	let i = build_fcmp Fcmp.Ult lhs_val rhs_val "cmptmp" builder in
				1288	build_uitofp i double_type "booltmp" builder
				1289	\| _ ->
				1290	(* If it wasn't a builtin binary operator, it must be a user defined
				1291	* one. Emit a call to it. *)
				1292	let callee = "binary" ^ (String.make 1 op) in
				1293	let callee =
				1294	match lookup_function callee the_module with
				1295	\| Some callee -> callee
				1296	\| None -> raise (Error "binary operator not found!")
				1297	in
				1298	build_call callee [\|lhs_val; rhs_val\|] "binop" builder
				1299	end
				1300	end
				1301	\| Ast.Call (callee, args) ->
				1302	(* Look up the name in the module table. *)
				1303	let callee =
				1304	match lookup_function callee the_module with
				1305	\| Some callee -> callee
				1306	\| None -> raise (Error "unknown function referenced")
				1307	in
				1308	let params = params callee in
				1309
				1310	(* If argument mismatch error. *)
				1311	if Array.length params == Array.length args then () else
				1312	raise (Error "incorrect # arguments passed");
				1313	let args = Array.map codegen_expr args in
				1314	build_call callee args "calltmp" builder
				1315	\| Ast.If (cond, then_, else_) ->
				1316	let cond = codegen_expr cond in
				1317
(* Convert condition to a bool by comparing non-equal to 0.0. *)
				1319	let zero = const_float double_type 0.0 in
				1320	let cond_val = build_fcmp Fcmp.One cond zero "ifcond" builder in
				1321
				1322	(* Grab the first block so that we might later add the conditional branch
				1323	* to it at the end of the function. *)
				1324	let start_bb = insertion_block builder in
				1325	let the_function = block_parent start_bb in
				1326
				1327	let then_bb = append_block context "then" the_function in
				1328
				1329	(* Emit 'then' value. *)
				1330	position_at_end then_bb builder;
				1331	let then_val = codegen_expr then_ in
				1332
				1333	(* Codegen of 'then' can change the current block, update then_bb for the
				1334	* phi. We create a new name because one is used for the phi node, and the
				1335	* other is used for the conditional branch. *)
				1336	let new_then_bb = insertion_block builder in
				1337
				1338	(* Emit 'else' value. *)
				1339	let else_bb = append_block context "else" the_function in
				1340	position_at_end else_bb builder;
				1341	let else_val = codegen_expr else_ in
				1342
				1343	(* Codegen of 'else' can change the current block, update else_bb for the
				1344	* phi. *)
				1345	let new_else_bb = insertion_block builder in
				1346
				1347	(* Emit merge block. *)
				1348	let merge_bb = append_block context "ifcont" the_function in
				1349	position_at_end merge_bb builder;
				1350	let incoming = [(then_val, new_then_bb); (else_val, new_else_bb)] in
				1351	let phi = build_phi incoming "iftmp" builder in
				1352
				1353	(* Return to the start block to add the conditional branch. *)
				1354	position_at_end start_bb builder;
				1355	ignore (build_cond_br cond_val then_bb else_bb builder);
				1356
(* Set an unconditional branch at the end of the 'then' block and the
				1358	* 'else' block to the 'merge' block. *)
				1359	position_at_end new_then_bb builder; ignore (build_br merge_bb builder);
				1360	position_at_end new_else_bb builder; ignore (build_br merge_bb builder);
				1361
				1362	(* Finally, set the builder to the end of the merge block. *)
				1363	position_at_end merge_bb builder;
				1364
				1365	phi
				1366	\| Ast.For (var_name, start, end_, step, body) ->
				1367	(* Output this as:
				1368	* var = alloca double
				1369	* ...
				1370	* start = startexpr
				1371	* store start -> var
				1372	* goto loop
				1373	* loop:
				1374	* ...
				1375	* bodyexpr
				1376	* ...
				1377	* loopend:
				1378	* step = stepexpr
				1379	* endcond = endexpr
				1380	*
				1381	* curvar = load var
				1382	* nextvar = curvar + step
				1383	* store nextvar -> var
* br endcond, loop, afterloop
* afterloop: *)
				1386
				1387	let the_function = block_parent (insertion_block builder) in
				1388
				1389	(* Create an alloca for the variable in the entry block. *)
				1390	let alloca = create_entry_block_alloca the_function var_name in
				1391
				1392	(* Emit the start code first, without 'variable' in scope. *)
				1393	let start_val = codegen_expr start in
				1394
				1395	(* Store the value into the alloca. *)
				1396	ignore(build_store start_val alloca builder);
				1397
				1398	(* Make the new basic block for the loop header, inserting after current
				1399	* block. *)
				1400	let loop_bb = append_block context "loop" the_function in
				1401
				1402	(* Insert an explicit fall through from the current block to the
				1403	* loop_bb. *)
				1404	ignore (build_br loop_bb builder);
				1405
				1406	(* Start insertion in loop_bb. *)
				1407	position_at_end loop_bb builder;
				1408
(* Within the loop, the variable is bound to its alloca. If it
* shadows an existing variable, we have to restore it, so save it
* now. *)
				1412	let old_val =
				1413	try Some (Hashtbl.find named_values var_name) with Not_found -> None
				1414	in
				1415	Hashtbl.add named_values var_name alloca;
				1416
				1417	(* Emit the body of the loop. This, like any other expr, can change the
				1418	* current BB. Note that we ignore the value computed by the body, but
				1419	* don't allow an error *)
				1420	ignore (codegen_expr body);
				1421
				1422	(* Emit the step value. *)
				1423	let step_val =
				1424	match step with
				1425	\| Some step -> codegen_expr step
				1426	(* If not specified, use 1.0. *)
				1427	\| None -> const_float double_type 1.0
				1428	in
				1429
				1430	(* Compute the end condition. *)
				1431	let end_cond = codegen_expr end_ in
				1432
				1433	(* Reload, increment, and restore the alloca. This handles the case where
				1434	* the body of the loop mutates the variable. *)
				1435	let cur_var = build_load alloca var_name builder in
				1436	let next_var = build_add cur_var step_val "nextvar" builder in
				1437	ignore(build_store next_var alloca builder);
				1438
(* Convert condition to a bool by comparing non-equal to 0.0. *)
				1440	let zero = const_float double_type 0.0 in
				1441	let end_cond = build_fcmp Fcmp.One end_cond zero "loopcond" builder in
				1442
				1443	(* Create the "after loop" block and insert it. *)
				1444	let after_bb = append_block context "afterloop" the_function in
				1445
(* Insert the conditional branch at the end of the current block. *)
				1447	ignore (build_cond_br end_cond loop_bb after_bb builder);
				1448
				1449	(* Any new code will be inserted in after_bb. *)
				1450	position_at_end after_bb builder;
				1451
				1452	(* Restore the unshadowed variable. *)
				1453	begin match old_val with
				1454	\| Some old_val -> Hashtbl.add named_values var_name old_val
				1455	\| None -> ()
				1456	end;
				1457
				1458	(* for expr always returns 0.0. *)
				1459	const_null double_type
				1460	\| Ast.Var (var_names, body) ->
				1461	let old_bindings = ref [] in
				1462
				1463	let the_function = block_parent (insertion_block builder) in
				1464
				1465	(* Register all variables and emit their initializer. *)
				1466	Array.iter (fun (var_name, init) ->
				1467	(* Emit the initializer before adding the variable to scope, this
				1468	* prevents the initializer from referencing the variable itself, and
				1469	* permits stuff like this:
				1470	* var a = 1 in
				1471	* var a = a in ... # refers to outer 'a'. *)
				1472	let init_val =
				1473	match init with
				1474	\| Some init -> codegen_expr init
				1475	(* If not specified, use 0.0. *)
				1476	\| None -> const_float double_type 0.0
				1477	in
				1478
				1479	let alloca = create_entry_block_alloca the_function var_name in
				1480	ignore(build_store init_val alloca builder);
				1481
				1482	(* Remember the old variable binding so that we can restore the binding
				1483	* when we unrecurse. *)
				1484	begin
				1485	try
				1486	let old_value = Hashtbl.find named_values var_name in
				1487	old_bindings := (var_name, old_value) :: !old_bindings;
				1488	with Not_found -> ()
				1489	end;
				1490
				1491	(* Remember this binding. *)
				1492	Hashtbl.add named_values var_name alloca;
				1493	) var_names;
				1494
				1495	(* Codegen the body, now that all vars are in scope. *)
				1496	let body_val = codegen_expr body in
				1497
				1498	(* Pop all our variables from scope. *)
				1499	List.iter (fun (var_name, old_value) ->
				1500	Hashtbl.add named_values var_name old_value
				1501	) !old_bindings;
				1502
				1503	(* Return the body computation. *)
				1504	body_val
				1505
				1506	let codegen_proto = function
				1507	\| Ast.Prototype (name, args) \| Ast.BinOpPrototype (name, args, _) ->
				1508	(* Make the function type: double(double,double) etc. *)
				1509	let doubles = Array.make (Array.length args) double_type in
				1510	let ft = function_type double_type doubles in
				1511	let f =
				1512	match lookup_function name the_module with
				1513	\| None -> declare_function name ft the_module
				1514
				1515	(* If 'f' conflicted, there was already something named 'name'. If it
				1516	* has a body, don't allow redefinition or reextern. *)
				1517	\| Some f ->
				1518	(* If 'f' already has a body, reject this. *)
				1519	if block_begin f <> At_end f then
				1520	raise (Error "redefinition of function");
				1521
				1522	(* If 'f' took a different number of arguments, reject. *)
				1523	if element_type (type_of f) <> ft then
				1524	raise (Error "redefinition of function with different # args");
				1525	f
				1526	in
				1527
				1528	(* Set names for all arguments. *)
				1529	Array.iteri (fun i a ->
				1530	let n = args.(i) in
				1531	set_value_name n a;
				1532	Hashtbl.add named_values n a;
				1533	) (params f);
				1534	f
				1535
				1536	(* Create an alloca for each argument and register the argument in the symbol
				1537	* table so that references to it will succeed. *)
				1538	let create_argument_allocas the_function proto =
				1539	let args = match proto with
				1540	\| Ast.Prototype (_, args) \| Ast.BinOpPrototype (_, args, _) -> args
				1541	in
				1542	Array.iteri (fun i ai ->
				1543	let var_name = args.(i) in
				1544	(* Create an alloca for this variable. *)
				1545	let alloca = create_entry_block_alloca the_function var_name in
				1546
				1547	(* Store the initial value into the alloca. *)
				1548	ignore(build_store ai alloca builder);
				1549
				1550	(* Add arguments to variable symbol table. *)
				1551	Hashtbl.add named_values var_name alloca;
				1552	) (params the_function)
				1553
				1554	let codegen_func the_fpm = function
				1555	\| Ast.Function (proto, body) ->
				1556	Hashtbl.clear named_values;
				1557	let the_function = codegen_proto proto in
				1558
				1559	(* If this is an operator, install it. *)
				1560	begin match proto with
				1561	\| Ast.BinOpPrototype (name, args, prec) ->
				1562	let op = name.[String.length name - 1] in
				1563	Hashtbl.add Parser.binop_precedence op prec;
				1564	\| _ -> ()
				1565	end;
				1566
				1567	(* Create a new basic block to start insertion into. *)
				1568	let bb = append_block context "entry" the_function in
				1569	position_at_end bb builder;
				1570
				1571	try
				1572	(* Add all arguments to the symbol table and create their allocas. *)
				1573	create_argument_allocas the_function proto;
				1574
				1575	let ret_val = codegen_expr body in
				1576
				1577	(* Finish off the function. *)
				1578	let _ = build_ret ret_val builder in
				1579
				1580	(* Validate the generated code, checking for consistency. *)
				1581	Llvm_analysis.assert_valid_function the_function;
				1582
				1583	(* Optimize the function. *)
				1584	let _ = PassManager.run_function the_function the_fpm in
				1585
				1586	the_function
				1587	with e ->
				1588	delete_function the_function;
				1589	raise e
				1590
toplevel.ml:
    .. code-block:: ocaml

        (*===----------------------------------------------------------------------===
         * Top-Level parsing and JIT Driver
         *===----------------------------------------------------------------------===*)

        open Llvm
        open Llvm_executionengine

        (* top ::= definition | external | expression | ';' *)
        let rec main_loop the_fpm the_execution_engine stream =
          match Stream.peek stream with
          | None -> ()

          (* ignore top-level semicolons. *)
          | Some (Token.Kwd ';') ->
              Stream.junk stream;
              main_loop the_fpm the_execution_engine stream

          | Some token ->
              begin
                try match token with
                | Token.Def ->
                    let e = Parser.parse_definition stream in
                    print_endline "parsed a function definition.";
                    dump_value (Codegen.codegen_func the_fpm e);
                | Token.Extern ->
                    let e = Parser.parse_extern stream in
                    print_endline "parsed an extern.";
                    dump_value (Codegen.codegen_proto e);
                | _ ->
                    (* Evaluate a top-level expression into an anonymous function. *)
                    let e = Parser.parse_toplevel stream in
                    print_endline "parsed a top-level expr";
                    let the_function = Codegen.codegen_func the_fpm e in
                    dump_value the_function;

                    (* JIT the function, returning a function pointer. *)
                    let result = ExecutionEngine.run_function the_function [||]
                      the_execution_engine in

                    print_string "Evaluated to ";
                    print_float (GenericValue.as_float Codegen.double_type result);
                    print_newline ();
                with Stream.Error s | Codegen.Error s ->
                  (* Skip token for error recovery. *)
                  Stream.junk stream;
                  print_endline s;
              end;
              print_string "ready> "; flush stdout;
              main_loop the_fpm the_execution_engine stream

toy.ml:
    .. code-block:: ocaml

        (*===----------------------------------------------------------------------===
         * Main driver code.
         *===----------------------------------------------------------------------===*)

        open Llvm
        open Llvm_executionengine
        open Llvm_target
        open Llvm_scalar_opts

        let main () =
          ignore (initialize_native_target ());

          (* Install standard binary operators.
           * 1 is the lowest precedence. *)
          Hashtbl.add Parser.binop_precedence '=' 2;
          Hashtbl.add Parser.binop_precedence '<' 10;
          Hashtbl.add Parser.binop_precedence '+' 20;
          Hashtbl.add Parser.binop_precedence '-' 20;
          Hashtbl.add Parser.binop_precedence '*' 40;    (* highest. *)

          (* Prime the first token. *)
          print_string "ready> "; flush stdout;
          let stream = Lexer.lex (Stream.of_channel stdin) in

          (* Create the JIT. *)
          let the_execution_engine = ExecutionEngine.create Codegen.the_module in
          let the_fpm = PassManager.create_function Codegen.the_module in

          (* Set up the optimizer pipeline.  Start with registering info about how the
           * target lays out data structures. *)
          DataLayout.add (ExecutionEngine.target_data the_execution_engine) the_fpm;

          (* Promote allocas to registers. *)
          add_memory_to_register_promotion the_fpm;

          (* Do simple "peephole" and bit-twiddling optimizations. *)
          add_instruction_combination the_fpm;

          (* Reassociate expressions. *)
          add_reassociation the_fpm;

          (* Eliminate common subexpressions. *)
          add_gvn the_fpm;

          (* Simplify the control flow graph (deleting unreachable blocks, etc). *)
          add_cfg_simplification the_fpm;

          ignore (PassManager.initialize the_fpm);

          (* Run the main "interpreter loop" now. *)
          Toplevel.main_loop the_fpm the_execution_engine stream;

          (* Print out all the generated code. *)
          dump_module Codegen.the_module
        ;;

        main ()

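The precedence table that the driver fills in can be exercised on its own. The
sketch below is illustrative only: the local table stands in for
``Parser.binop_precedence``, and the ``precedence`` helper, with its ``-1``
result for characters that are not registered operators, is an assumption of
this example, not something the driver above defines.

.. code-block:: ocaml

    (* Hypothetical stand-in for Parser.binop_precedence. *)
    let binop_precedence : (char, int) Hashtbl.t = Hashtbl.create 10

    (* Returns the precedence of a binary operator, or -1 when the character
     * is not a registered operator. The -1 default is an assumption. *)
    let precedence c = try Hashtbl.find binop_precedence c with Not_found -> -1

    let () =
      List.iter
        (fun (op, prec) -> Hashtbl.add binop_precedence op prec)
        [ ('=', 2); ('<', 10); ('+', 20); ('-', 20); ('*', 40) ];
      (* Multiplication binds tighter than addition, which binds tighter
       * than comparison and assignment. *)
      assert (precedence '*' > precedence '+');
      assert (precedence '+' > precedence '<');
      assert (precedence '<' > precedence '=');
      (* This character was never installed, so it is not an operator. *)
      assert (precedence '!' = -1)

Giving ``=`` the lowest installed precedence means that in ``x = y + 1`` the
addition is grouped first and its result is what gets assigned.
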
bindings.c:
    .. code-block:: c

        #include <stdio.h>

        /* putchard - putchar that takes a double and returns 0. */
        extern double putchard(double X) {
          putchar((char)X);
          return 0;
        }

        /* printd - printf that takes a double, prints it as "%f\n", and returns 0. */
        extern double printd(double X) {
          printf("%f\n", X);
          return 0;
        }

`Next: Conclusion and other useful LLVM tidbits <OCamlLangImpl8.html>`_
