<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
                      "http://www.w3.org/TR/html4/strict.dtd">
				3
				4	<html>
				5	<head>
				6	<title>Kaleidoscope: Adding JIT and Optimizer Support</title>
				7	<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
				8	<meta name="author" content="Chris Lattner">
				9	<meta name="author" content="Erick Tryzelaar">
				10	<link rel="stylesheet" href="../llvm.css" type="text/css">
				11	</head>
				12
				13	<body>
				14
				15	<div class="doc_title">Kaleidoscope: Adding JIT and Optimizer Support</div>
				16
				17	<ul>
				18	<li><a href="index.html">Up to Tutorial Index</a></li>
				19	<li>Chapter 4
				20	<ol>
				21	<li><a href="#intro">Chapter 4 Introduction</a></li>
				22	<li><a href="#trivialconstfold">Trivial Constant Folding</a></li>
				23	<li><a href="#optimizerpasses">LLVM Optimization Passes</a></li>
				24	<li><a href="#jit">Adding a JIT Compiler</a></li>
				25	<li><a href="#code">Full Code Listing</a></li>
				26	</ol>
				27	</li>
				28	<li><a href="OCamlLangImpl5.html">Chapter 5</a>: Extending the Language: Control
				29	Flow</li>
				30	</ul>
				31
				32	<div class="doc_author">
				33	<p>
				34	Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a>
				35	and <a href="mailto:idadesub@users.sourceforge.net">Erick Tryzelaar</a>
				36	</p>
				37	</div>
				38
				39	<!-- *********************************************************************** -->
				40	<div class="doc_section"><a name="intro">Chapter 4 Introduction</a></div>
				41	<!-- *********************************************************************** -->
				42
				43	<div class="doc_text">
				44
				45	<p>Welcome to Chapter 4 of the "<a href="index.html">Implementing a language
				46	with LLVM</a>" tutorial. Chapters 1-3 described the implementation of a simple
				47	language and added support for generating LLVM IR. This chapter describes
				48	two new techniques: adding optimizer support to your language, and adding JIT
				49	compiler support. These additions will demonstrate how to get nice, efficient code
				50	for the Kaleidoscope language.</p>
				51
				52	</div>
				53
				54	<!-- *********************************************************************** -->
				55	<div class="doc_section"><a name="trivialconstfold">Trivial Constant
				56	Folding</a></div>
				57	<!-- *********************************************************************** -->
				58
				59	<div class="doc_text">
				60
<p><b>Note:</b> the OCaml bindings already use <tt>LLVMFoldingBuilder</tt>.</p>
				62
				63	<p>
				64	Our demonstration for Chapter 3 is elegant and easy to extend. Unfortunately,
				65	it does not produce wonderful code. For example, when compiling simple code,
				66	we don't get obvious optimizations:</p>
				67
				68	<div class="doc_code">
				69	<pre>
				70	ready> <b>def test(x) 1+2+x;</b>
				71	Read function definition:
				72	define double @test(double %x) {
				73	entry:
				74	%addtmp = add double 1.000000e+00, 2.000000e+00
				75	%addtmp1 = add double %addtmp, %x
				76	ret double %addtmp1
				77	}
				78	</pre>
				79	</div>
				80
				81	<p>This code is a very, very literal transcription of the AST built by parsing
				82	the input. As such, this transcription lacks optimizations like constant folding
				83	(we'd like to get "<tt>add x, 3.0</tt>" in the example above) as well as other
				84	more important optimizations. Constant folding, in particular, is a very common
				85	and very important optimization: so much so that many language implementors
				86	implement constant folding support in their AST representation.</p>
				87
<p>With LLVM, you don't need this support in the AST. Since all calls to build
LLVM IR go through the LLVM builder, it would be nice if the builder itself
checked for a constant folding opportunity each time you call it.
If there is one, it could just do the constant fold and return the constant
instead of creating an instruction. This is exactly what the
<tt>LLVMFoldingBuilder</tt> class does.</p>
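<p>The idea can be sketched in plain OCaml. This is a toy model, not the LLVM
bindings: the <tt>value</tt> type, <tt>build_add</tt>, and the <tt>emitted</tt>
list are hypothetical names invented for illustration. The builder returns a
constant when both operands are constants and only emits an instruction
otherwise.</p>

```ocaml
(* Toy model of a folding builder. None of these names come from LLVM. *)
type value =
  | Const of float   (* a compile-time constant *)
  | Arg of string    (* a function argument such as %x *)
  | Instr of string  (* the result of an emitted instruction, by name *)

(* Instructions emitted so far, most recent first. *)
let emitted : (string * value * value) list ref = ref []
let counter = ref 0

(* Fold when both operands are constants; otherwise emit an instruction and
 * return a reference to its result. *)
let build_add lhs rhs =
  match lhs, rhs with
  | Const a, Const b -> Const (a +. b)
  | _ ->
      incr counter;
      let name = Printf.sprintf "addtmp%d" !counter in
      emitted := (name, lhs, rhs) :: !emitted;
      Instr name

(* 1+2+x: the inner add folds to 3.0, so only one instruction is emitted. *)
let result = build_add (build_add (Const 1.0) (Const 2.0)) (Arg "x")
```

<p>Because the check lives inside the builder call, the code generator that
calls <tt>build_add</tt> does not need any constant checks of its own.</p>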
				94
<p>All we did was switch from <tt>LLVMBuilder</tt> to
<tt>LLVMFoldingBuilder</tt>. Though we changed no other code, all of our
instructions are now implicitly constant folded without us having to do
anything about it. For example, the input above now compiles to:</p>
				99
				100	<div class="doc_code">
				101	<pre>
				102	ready> <b>def test(x) 1+2+x;</b>
				103	Read function definition:
				104	define double @test(double %x) {
				105	entry:
				106	%addtmp = add double 3.000000e+00, %x
				107	ret double %addtmp
				108	}
				109	</pre>
				110	</div>
				111
<p>Well, that was easy :). In practice, we recommend always using
<tt>LLVMFoldingBuilder</tt> when generating code like this. It has no
"syntactic overhead" for its use (you don't have to uglify your compiler with
constant checks everywhere) and it can dramatically reduce the amount of
LLVM IR that is generated in some cases (particularly for languages with a macro
preprocessor or that use a lot of constants).</p>
				118
				119	<p>On the other hand, the <tt>LLVMFoldingBuilder</tt> is limited by the fact
				120	that it does all of its analysis inline with the code as it is built. If you
				121	take a slightly more complex example:</p>
				122
				123	<div class="doc_code">
				124	<pre>
				125	ready> <b>def test(x) (1+2+x)*(x+(1+2));</b>
				126	ready> Read function definition:
				127	define double @test(double %x) {
				128	entry:
				129	%addtmp = add double 3.000000e+00, %x
				130	%addtmp1 = add double %x, 3.000000e+00
				131	%multmp = mul double %addtmp, %addtmp1
				132	ret double %multmp
				133	}
				134	</pre>
				135	</div>
				136
<p>In this case, the LHS and RHS of the multiplication are the same value. We'd
really like to see this generate "<tt>tmp = x+3; result = tmp*tmp;</tt>" instead
of computing "<tt>x+3</tt>" twice.</p>
				140
<p>Unfortunately, no amount of local analysis will be able to detect and correct
this. This requires two transformations: reassociation of expressions (to
make the adds lexically identical) and Common Subexpression Elimination (CSE)
to delete the redundant add instruction. Fortunately, LLVM provides a broad
range of optimizations that you can use, in the form of "passes".</p>
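<p>To make the two steps concrete, here is a minimal sketch over a toy
straight-line instruction list. It is not how LLVM implements these passes:
the <tt>instr</tt> record, <tt>canonicalize</tt>, and <tt>cse</tt> are
hypothetical, and <tt>canonicalize</tt> only swaps the operands of commutative
operations, which is a much simplified stand-in for real reassociation.</p>

```ocaml
type operand = Const of float | Var of string

(* One instruction: dest = lhs op rhs. *)
type instr = { dest : string; op : char; lhs : operand; rhs : operand }

(* Put commutative operations into one canonical shape so that equal
 * computations look identical: constants move to the right-hand side. *)
let canonicalize i =
  match i.op, i.lhs, i.rhs with
  | ('+' | '*'), Const _, Var _ -> { i with lhs = i.rhs; rhs = i.lhs }
  | _ -> i

(* Common subexpression elimination over a straight-line block: remember each
 * (op, lhs, rhs) already computed and redirect uses of any duplicate. *)
let cse block =
  let seen = Hashtbl.create 16 in     (* (op, lhs, rhs) -> dest *)
  let renamed = Hashtbl.create 16 in  (* eliminated dest -> surviving dest *)
  let subst = function
    | Var v when Hashtbl.mem renamed v -> Var (Hashtbl.find renamed v)
    | o -> o
  in
  List.filter_map
    (fun i ->
      let i = canonicalize { i with lhs = subst i.lhs; rhs = subst i.rhs } in
      let key = (i.op, i.lhs, i.rhs) in
      match Hashtbl.find_opt seen key with
      | Some d -> Hashtbl.add renamed i.dest d; None
      | None -> Hashtbl.add seen key i.dest; Some i)
    block

(* The unoptimized body of test(x) from the example above. *)
let before = [
  { dest = "addtmp";  op = '+'; lhs = Const 3.0;    rhs = Var "x" };
  { dest = "addtmp1"; op = '+'; lhs = Var "x";      rhs = Const 3.0 };
  { dest = "multmp";  op = '*'; lhs = Var "addtmp"; rhs = Var "addtmp1" };
]

(* Two instructions remain: addtmp = x + 3.0 and multmp = addtmp * addtmp. *)
let after = cse before
```

<p>Neither step is enough alone: without canonicalization the two adds have
different keys, and without CSE the duplicate is never removed.</p>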
				146
				147	</div>
				148
				149	<!-- *********************************************************************** -->
				150	<div class="doc_section"><a name="optimizerpasses">LLVM Optimization
				151	Passes</a></div>
				152	<!-- *********************************************************************** -->
				153
				154	<div class="doc_text">
				155
				156	<p>LLVM provides many optimization passes, which do many different sorts of
				157	things and have different tradeoffs. Unlike other systems, LLVM doesn't hold
				158	to the mistaken notion that one set of optimizations is right for all languages
				159	and for all situations. LLVM allows a compiler implementor to make complete
				160	decisions about what optimizations to use, in which order, and in what
				161	situation.</p>
				162
<p>As a concrete example, LLVM supports "whole module" passes, which look
across as large a body of code as they can (often a whole file, but if run
at link time, this can be a substantial portion of the whole program). It also
supports and includes "per-function" passes, which operate on a single
function at a time without looking at other functions. For more information
on passes and how they are run, see the <a href="../WritingAnLLVMPass.html">How
to Write a Pass</a> document and the <a href="../Passes.html">List of LLVM
Passes</a>.</p>
				171
				172	<p>For Kaleidoscope, we are currently generating functions on the fly, one at
				173	a time, as the user types them in. We aren't shooting for the ultimate
				174	optimization experience in this setting, but we also want to catch the easy and
				175	quick stuff where possible. As such, we will choose to run a few per-function
				176	optimizations as the user types the function in. If we wanted to make a "static
				177	Kaleidoscope compiler", we would use exactly the code we have now, except that
				178	we would defer running the optimizer until the entire file has been parsed.</p>
				179
				180	<p>In order to get per-function optimizations going, we need to set up a
				181	<a href="../WritingAnLLVMPass.html#passmanager">Llvm.PassManager</a> to hold and
				182	organize the LLVM optimizations that we want to run. Once we have that, we can
				183	add a set of optimizations to run. The code looks like this:</p>
				184
				185	<div class="doc_code">
				186	<pre>
  (* Create the JIT. *)
  let the_module_provider = ModuleProvider.create Codegen.the_module in
  let the_execution_engine = ExecutionEngine.create the_module_provider in
  let the_fpm = PassManager.create_function the_module_provider in

  (* Set up the optimizer pipeline. Start with registering info about how the
   * target lays out data structures. *)
  TargetData.add (ExecutionEngine.target_data the_execution_engine) the_fpm;

  (* Do simple "peephole" optimizations and bit-twiddling optzn. *)
  add_instruction_combining the_fpm;

  (* Reassociate expressions. *)
  add_reassociation the_fpm;

  (* Eliminate Common SubExpressions. *)
  add_gvn the_fpm;

  (* Simplify the control flow graph (deleting unreachable blocks, etc). *)
  add_cfg_simplification the_fpm;

  (* Run the main "interpreter loop" now. *)
  Toplevel.main_loop the_fpm the_execution_engine stream;
				210	</pre>
				211	</div>
				212
				213	<p>This code defines two values, an <tt>Llvm.llmoduleprovider</tt> and a
				214	<tt>Llvm.PassManager.t</tt>. The former is basically a wrapper around our
				215	<tt>Llvm.llmodule</tt> that the <tt>Llvm.PassManager.t</tt> requires. It
				216	provides certain flexibility that we're not going to take advantage of here,
				217	so I won't dive into any details about it.</p>
				218
<p>The meat of the matter here is the definition of "<tt>the_fpm</tt>". It
requires <tt>the_module</tt> (through the
<tt>the_module_provider</tt>) to construct itself. Once it is set up, we use a
series of "add" calls to add a bunch of LLVM passes. The first pass is
basically boilerplate: it adds a pass so that later optimizations know how the
data structures in the program are laid out. The
"<tt>the_execution_engine</tt>" variable is related to the JIT, which we will
get to in the next section.</p>
				227
				228	<p>In this case, we choose to add 4 optimization passes. The passes we chose
				229	here are a pretty standard set of "cleanup" optimizations that are useful for
				230	a wide variety of code. I won't delve into what they do but, believe me,
				231	they are a good starting place :).</p>
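<p>As a rough mental model only, a function pass manager can be thought of as
an ordered list of rewrites that a function body is threaded through. The
sketch below is plain OCaml over a toy expression type; <tt>pass</tt>,
<tt>everywhere</tt>, <tt>run_passes</tt>, and the two example passes are
hypothetical and are not the passes added above.</p>

```ocaml
type expr =
  | Num of float
  | Var of string
  | Add of expr * expr
  | Mul of expr * expr

(* A "pass" here is just a named rewrite over an expression tree. *)
type pass = { name : string; run : expr -> expr }

(* Apply a rewrite rule bottom-up to every node. *)
let rec everywhere rule e =
  let e' =
    match e with
    | Add (a, b) -> Add (everywhere rule a, everywhere rule b)
    | Mul (a, b) -> Mul (everywhere rule a, everywhere rule b)
    | Num _ | Var _ -> e
  in
  rule e'

let fold_constants = {
  name = "constant-fold";
  run = everywhere (function
    | Add (Num a, Num b) -> Num (a +. b)
    | Mul (Num a, Num b) -> Num (a *. b)
    | e -> e);
}

let simplify_identities = {
  name = "identities";
  run = everywhere (function
    | Mul (e, Num 1.0) | Mul (Num 1.0, e) -> e
    | e -> e);
}

(* Running the manager threads the body through each pass in list order. *)
let run_passes passes body =
  List.fold_left (fun e p -> p.run e) body passes

let toy_fpm = [ fold_constants; simplify_identities ]
let body = Mul (Add (Num 0.5, Num 0.5), Var "x")

(* (0.5 + 0.5) * x  becomes  1.0 * x  and then  x. *)
let optimized = run_passes toy_fpm body

(* In the opposite order the identity pass runs too early and finds nothing. *)
let reordered = run_passes (List.rev toy_fpm) body
```

<p>The second result shows why the order in which passes are added matters: a
later pass can only clean up what an earlier pass has already exposed.</p>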
				232
<p>Once the <tt>Llvm.PassManager.t</tt> is set up, we need to make use of it.
We do this by running it after our newly created function is constructed (in
<tt>Codegen.codegen_func</tt>), but before it is returned to the client:</p>
				236
				237	<div class="doc_code">
				238	<pre>
let codegen_func the_fpm = function
      ...
      try
        let ret_val = codegen_expr body in

        (* Finish off the function. *)
        let _ = build_ret ret_val builder in

        (* Validate the generated code, checking for consistency. *)
        Llvm_analysis.assert_valid_function the_function;

        (* Optimize the function. *)
        let _ = PassManager.run_function the_function the_fpm in

        the_function
				254	</pre>
				255	</div>
				256
				257	<p>As you can see, this is pretty straightforward. The <tt>the_fpm</tt>
				258	optimizes and updates the LLVM Function* in place, improving (hopefully) its
				259	body. With this in place, we can try our test above again:</p>
				260
				261	<div class="doc_code">
				262	<pre>
				263	ready> <b>def test(x) (1+2+x)*(x+(1+2));</b>
				264	ready> Read function definition:
				265	define double @test(double %x) {
				266	entry:
				267	%addtmp = add double %x, 3.000000e+00
				268	%multmp = mul double %addtmp, %addtmp
				269	ret double %multmp
				270	}
				271	</pre>
				272	</div>
				273
				274	<p>As expected, we now get our nicely optimized code, saving a floating point
				275	add instruction from every execution of this function.</p>
				276
				277	<p>LLVM provides a wide variety of optimizations that can be used in certain
				278	circumstances. Some <a href="../Passes.html">documentation about the various
				279	passes</a> is available, but it isn't very complete. Another good source of
				280	ideas can come from looking at the passes that <tt>llvm-gcc</tt> or
				281	<tt>llvm-ld</tt> run to get started. The "<tt>opt</tt>" tool allows you to
				282	experiment with passes from the command line, so you can see if they do
				283	anything.</p>
				284
<p>Now that we have reasonable code coming out of our front-end, let's talk about
executing it!</p>
				287
				288	</div>
				289
				290	<!-- *********************************************************************** -->
				291	<div class="doc_section"><a name="jit">Adding a JIT Compiler</a></div>
				292	<!-- *********************************************************************** -->
				293
				294	<div class="doc_text">
				295
				296	<p>Code that is available in LLVM IR can have a wide variety of tools
				297	applied to it. For example, you can run optimizations on it (as we did above),
				298	you can dump it out in textual or binary forms, you can compile the code to an
				299	assembly file (.s) for some target, or you can JIT compile it. The nice thing
				300	about the LLVM IR representation is that it is the "common currency" between
				301	many different parts of the compiler.
				302	</p>
				303
				304	<p>In this section, we'll add JIT compiler support to our interpreter. The
				305	basic idea that we want for Kaleidoscope is to have the user enter function
				306	bodies as they do now, but immediately evaluate the top-level expressions they
				307	type in. For example, if they type in "1 + 2;", we should evaluate and print
				308	out 3. If they define a function, they should be able to call it from the
				309	command line.</p>
				310
<p>In order to do this, we first declare and initialize the JIT. This is done
by adding two bindings in <tt>main</tt>:</p>
				313
				314	<div class="doc_code">
				315	<pre>
...
let main () =
  ...
  <b>(* Create the JIT. *)
  let the_module_provider = ModuleProvider.create Codegen.the_module in
  let the_execution_engine = ExecutionEngine.create the_module_provider in</b>
  ...
				323	</pre>
				324	</div>
				325
				326	<p>This creates an abstract "Execution Engine" which can be either a JIT
				327	compiler or the LLVM interpreter. LLVM will automatically pick a JIT compiler
				328	for you if one is available for your platform, otherwise it will fall back to
				329	the interpreter.</p>
				330
<p>Once the <tt>Llvm_executionengine.ExecutionEngine.t</tt> is created, the JIT
is ready to be used. There are a variety of APIs that are useful, but the
simplest one is the "<tt>Llvm_executionengine.ExecutionEngine.run_function</tt>"
function. As used below, it takes the LLVM function, an array of arguments, and
the execution engine, runs the function through the JIT, and returns the result
as a generic value. In our case, this means that we
can change the code that parses a top-level expression to look like this:</p>
				337
				338	<div class="doc_code">
				339	<pre>
            (* Evaluate a top-level expression into an anonymous function. *)
            let e = Parser.parse_toplevel stream in
            print_endline "parsed a top-level expr";
            let the_function = Codegen.codegen_func the_fpm e in
            dump_value the_function;

            (* JIT the function and run it with no arguments, returning the
             * result as a generic value. *)
            let result = ExecutionEngine.run_function the_function [||]
              the_execution_engine in

            print_string "Evaluated to ";
            print_float (GenericValue.as_float double_type result);
            print_newline ();
				353	</pre>
				354	</div>
				355
<p>Recall that we compile top-level expressions into a self-contained LLVM
function that takes no arguments and returns the computed double. Because the
LLVM JIT compiler matches the native platform ABI, the generated code can be
called directly, like any other native function. In other words, there is no
difference between JIT compiled code and native machine
code that is statically linked into your application.</p>
				362
<p>With just these two changes, let's see how Kaleidoscope works now!</p>
				364
				365	<div class="doc_code">
				366	<pre>
				367	ready> <b>4+5;</b>
				368	define double @""() {
				369	entry:
				370	ret double 9.000000e+00
				371	}
				372
				373	<em>Evaluated to 9.000000</em>
				374	</pre>
				375	</div>
				376
				377	<p>Well this looks like it is basically working. The dump of the function
				378	shows the "no argument function that always returns double" that we synthesize
				379	for each top level expression that is typed in. This demonstrates very basic
				380	functionality, but can we do more?</p>
				381
				382	<div class="doc_code">
				383	<pre>
				384	ready> <b>def testfunc(x y) x + y*2; </b>
				385	Read function definition:
				386	define double @testfunc(double %x, double %y) {
				387	entry:
				388	%multmp = mul double %y, 2.000000e+00
				389	%addtmp = add double %multmp, %x
				390	ret double %addtmp
				391	}
				392
				393	ready> <b>testfunc(4, 10);</b>
				394	define double @""() {
				395	entry:
				396	%calltmp = call double @testfunc( double 4.000000e+00, double 1.000000e+01 )
				397	ret double %calltmp
				398	}
				399
				400	<em>Evaluated to 24.000000</em>
				401	</pre>
				402	</div>
				403
				404	<p>This illustrates that we can now call user code, but there is something a bit
				405	subtle going on here. Note that we only invoke the JIT on the anonymous
				406	functions that <em>call testfunc</em>, but we never invoked it on <em>testfunc
				407	</em>itself.</p>
				408
				409	<p>What actually happened here is that the anonymous function was JIT'd when
				410	requested. When the Kaleidoscope app calls through the function pointer that is
				411	returned, the anonymous function starts executing. It ends up making the call
				412	to the "testfunc" function, and ends up in a stub that invokes the JIT, lazily,
				413	on testfunc. Once the JIT finishes lazily compiling testfunc,
				414	it returns and the code re-executes the call.</p>
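<p>The stub mechanism can be imitated in a few lines of plain OCaml. This is
only an analogy under stated assumptions: "compiling" is modelled as swapping a
table entry, and <tt>table</tt>, <tt>define</tt>, <tt>call</tt>, and
<tt>compile_count</tt> are hypothetical names, not part of the LLVM bindings.</p>

```ocaml
(* Counts how many times a function body has been "compiled". *)
let compile_count = ref 0

(* Function name -> code to run when the function is called. *)
let table : (string, float array -> float) Hashtbl.t = Hashtbl.create 16

(* Each entry starts out as a stub. The first call through the stub compiles
 * the body, patches the table entry, and then runs the compiled code. *)
let define name (body : float array -> float) =
  let stub args =
    incr compile_count;               (* stands in for real code generation *)
    Hashtbl.replace table name body;  (* later calls skip the stub *)
    body args
  in
  Hashtbl.replace table name stub

let call name args = (Hashtbl.find table name) args

(* def testfunc(x y) x + y*2; *)
let () = define "testfunc" (fun a -> a.(0) +. a.(1) *. 2.0)

let compiled_before = !compile_count          (* nothing compiled yet *)
let first = call "testfunc" [| 4.0; 10.0 |]   (* goes through the stub *)
let second = call "testfunc" [| 4.0; 10.0 |]  (* runs the patched entry *)
let compiled_after = !compile_count
```

<p>Defining the function costs nothing up front; the work happens on the first
call, and every later call goes straight to the compiled body.</p>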
				415
				416	<p>In summary, the JIT will lazily JIT code, on the fly, as it is needed. The
				417	JIT provides a number of other more advanced interfaces for things like freeing
				418	allocated machine code, rejit'ing functions to update them, etc. However, even
				419	with this simple code, we get some surprisingly powerful capabilities - check
				420	this out (I removed the dump of the anonymous functions, you should get the idea
				421	by now :) :</p>
				422
				423	<div class="doc_code">
				424	<pre>
				425	ready> <b>extern sin(x);</b>
				426	Read extern:
				427	declare double @sin(double)
				428
				429	ready> <b>extern cos(x);</b>
				430	Read extern:
				431	declare double @cos(double)
				432
				433	ready> <b>sin(1.0);</b>
				434	<em>Evaluated to 0.841471</em>
				435
ready> <b>def foo(x) sin(x)*sin(x) + cos(x)*cos(x);</b>
				437	Read function definition:
				438	define double @foo(double %x) {
				439	entry:
				440	%calltmp = call double @sin( double %x )
				441	%multmp = mul double %calltmp, %calltmp
				442	%calltmp2 = call double @cos( double %x )
				443	%multmp4 = mul double %calltmp2, %calltmp2
				444	%addtmp = add double %multmp, %multmp4
				445	ret double %addtmp
				446	}
				447
				448	ready> <b>foo(4.0);</b>
				449	<em>Evaluated to 1.000000</em>
				450	</pre>
				451	</div>
				452
				453	<p>Whoa, how does the JIT know about sin and cos? The answer is surprisingly
				454	simple: in this example, the JIT started execution of a function and got to a
				455	function call. It realized that the function was not yet JIT compiled and
				456	invoked the standard set of routines to resolve the function. In this case,
				457	there is no body defined for the function, so the JIT ended up calling
				458	"<tt>dlsym("sin")</tt>" on the Kaleidoscope process itself. Since
				459	"<tt>sin</tt>" is defined within the JIT's address space, it simply patches up
				460	calls in the module to call the libm version of <tt>sin</tt> directly.</p>
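<p>The lookup order described here can be sketched as a two-level table in
plain OCaml. The names <tt>module_functions</tt>, <tt>host_symbols</tt>, and
<tt>resolve</tt> are hypothetical, and the host table is only a stand-in for
what <tt>dlsym</tt> would find in the running process.</p>

```ocaml
(* Functions that have a body defined in the "module". *)
let module_functions : (string, float -> float) Hashtbl.t = Hashtbl.create 16

(* Symbols exported by the host process; a stand-in for a dlsym lookup. *)
let host_symbols : (string, float -> float) Hashtbl.t = Hashtbl.create 16

let () =
  Hashtbl.replace host_symbols "sin" sin;
  Hashtbl.replace host_symbols "cos" cos

(* Prefer a definition in the module, then fall back to the host process, and
 * fail if neither knows the symbol. *)
let resolve name =
  match Hashtbl.find_opt module_functions name with
  | Some f -> f
  | None ->
      match Hashtbl.find_opt host_symbols name with
      | Some f -> f
      | None -> failwith ("unresolved symbol: " ^ name)

(* def foo(x) sin(x)*sin(x) + cos(x)*cos(x); with sin and cos found by name. *)
let () =
  Hashtbl.replace module_functions "foo" (fun x ->
    let s = resolve "sin" x and c = resolve "cos" x in
    s *. s +. c *. c)

(* Approximately 1.0, up to floating-point rounding. *)
let value = resolve "foo" 4.0
```

<p>Because module definitions are consulted first, a user-defined function
with the same name as a host symbol would take precedence in this sketch.</p>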
				461
				462	<p>The LLVM JIT provides a number of interfaces (look in the
				463	<tt>llvm_executionengine.mli</tt> file) for controlling how unknown functions
				464	get resolved. It allows you to establish explicit mappings between IR objects
				465	and addresses (useful for LLVM global variables that you want to map to static
				466	tables, for example), allows you to dynamically decide on the fly based on the
				467	function name, and even allows you to have the JIT abort itself if any lazy
				468	compilation is attempted.</p>
				469
				470	<p>One interesting application of this is that we can now extend the language
				471	by writing arbitrary C code to implement operations. For example, if we add:
				472	</p>
				473
				474	<div class="doc_code">
				475	<pre>
				476	/* putchard - putchar that takes a double and returns 0. */
				477	extern "C"
				478	double putchard(double X) {
				479	putchar((char)X);
				480	return 0;
				481	}
				482	</pre>
				483	</div>
				484
				485	<p>Now we can produce simple output to the console by using things like:
				486	"<tt>extern putchard(x); putchard(120);</tt>", which prints a lowercase 'x' on
				487	the console (120 is the ASCII code for 'x'). Similar code could be used to
				488	implement file I/O, console input, and many other capabilities in
				489	Kaleidoscope.</p>
				490
				491	<p>This completes the JIT and optimizer chapter of the Kaleidoscope tutorial. At
				492	this point, we can compile a non-Turing-complete programming language, optimize
				493	and JIT compile it in a user-driven way. Next up we'll look into <a
				494	href="OCamlLangImpl5.html">extending the language with control flow
				495	constructs</a>, tackling some interesting LLVM IR issues along the way.</p>
				496
				497	</div>
				498
				499	<!-- *********************************************************************** -->
				500	<div class="doc_section"><a name="code">Full Code Listing</a></div>
				501	<!-- *********************************************************************** -->
				502
				503	<div class="doc_text">
				504
				505	<p>
				506	Here is the complete code listing for our running example, enhanced with the
				507	LLVM JIT and optimizer. To build this example, use:
				508	</p>
				509
<div class="doc_code">
				511	<pre>
				512	# Compile
				513	ocamlbuild toy.byte
				514	# Run
				515	./toy.byte
				516	</pre>
				517	</div>
				518
				519	<p>Here is the code:</p>
				520
<dl>
				522	<dt>_tags:</dt>
				523	<dd class="doc_code">
				524	<pre>
				525	<{lexer,parser}.ml>: use_camlp4, pp(camlp4of)
				526	<*.{byte,native}>: g++, use_llvm, use_llvm_analysis
				527	<*.{byte,native}>: use_llvm_executionengine, use_llvm_target
				528	<*.{byte,native}>: use_llvm_scalar_opts, use_bindings
				529	</pre>
				530	</dd>
				531
				532	<dt>myocamlbuild.ml:</dt>
				533	<dd class="doc_code">
				534	<pre>
				535	open Ocamlbuild_plugin;;
				536
				537	ocaml_lib ~extern:true "llvm";;
				538	ocaml_lib ~extern:true "llvm_analysis";;
				539	ocaml_lib ~extern:true "llvm_executionengine";;
				540	ocaml_lib ~extern:true "llvm_target";;
				541	ocaml_lib ~extern:true "llvm_scalar_opts";;
				542
				543	flag ["link"; "ocaml"; "g++"] (S[A"-cc"; A"g++"]);;
				544	dep ["link"; "ocaml"; "use_bindings"] ["bindings.o"];;
				545	</pre>
				546	</dd>
				547
				548	<dt>token.ml:</dt>
				549	<dd class="doc_code">
				550	<pre>
(*===----------------------------------------------------------------------===
 * Lexer Tokens
 *===----------------------------------------------------------------------===*)

(* The lexer returns these 'Kwd' if it is an unknown character, otherwise one of
 * these others for known things. *)
type token =
  (* commands *)
  | Def | Extern

  (* primary *)
  | Ident of string | Number of float

  (* unknown *)
  | Kwd of char
				566	</pre>
				567	</dd>
				568
				569	<dt>lexer.ml:</dt>
				570	<dd class="doc_code">
				571	<pre>
(*===----------------------------------------------------------------------===
 * Lexer
 *===----------------------------------------------------------------------===*)

let rec lex = parser
  (* Skip any whitespace. *)
  | [< ' (' ' | '\n' | '\r' | '\t'); stream >] -> lex stream

  (* identifier: [a-zA-Z][a-zA-Z0-9] *)
  | [< ' ('A' .. 'Z' | 'a' .. 'z' as c); stream >] ->
      let buffer = Buffer.create 1 in
      Buffer.add_char buffer c;
      lex_ident buffer stream

  (* number: [0-9.]+ *)
  | [< ' ('0' .. '9' as c); stream >] ->
      let buffer = Buffer.create 1 in
      Buffer.add_char buffer c;
      lex_number buffer stream

  (* Comment until end of line. *)
  | [< ' ('#'); stream >] ->
      lex_comment stream

  (* Otherwise, just return the character as its ascii value. *)
  | [< 'c; stream >] ->
      [< 'Token.Kwd c; lex stream >]

  (* end of stream. *)
  | [< >] -> [< >]

and lex_number buffer = parser
  | [< ' ('0' .. '9' | '.' as c); stream >] ->
      Buffer.add_char buffer c;
      lex_number buffer stream
  | [< stream=lex >] ->
      [< 'Token.Number (float_of_string (Buffer.contents buffer)); stream >]

and lex_ident buffer = parser
  | [< ' ('A' .. 'Z' | 'a' .. 'z' | '0' .. '9' as c); stream >] ->
      Buffer.add_char buffer c;
      lex_ident buffer stream
  | [< stream=lex >] ->
      match Buffer.contents buffer with
      | "def" -> [< 'Token.Def; stream >]
      | "extern" -> [< 'Token.Extern; stream >]
      | id -> [< 'Token.Ident id; stream >]

and lex_comment = parser
  | [< ' ('\n'); stream=lex >] -> stream
  | [< 'c; e=lex_comment >] -> e
  | [< >] -> [< >]
				624	</pre>
				625	</dd>
				626
				627	<dt>ast.ml:</dt>
				628	<dd class="doc_code">
				629	<pre>
(*===----------------------------------------------------------------------===
 * Abstract Syntax Tree (aka Parse Tree)
 *===----------------------------------------------------------------------===*)

(* expr - Base type for all expression nodes. *)
type expr =
  (* variant for numeric literals like "1.0". *)
  | Number of float

  (* variant for referencing a variable, like "a". *)
  | Variable of string

  (* variant for a binary operator. *)
  | Binary of char * expr * expr

  (* variant for function calls. *)
  | Call of string * expr array

(* proto - This type represents the "prototype" for a function, which captures
 * its name, and its argument names (thus implicitly the number of arguments the
 * function takes). *)
type proto = Prototype of string * string array

(* func - This type represents a function definition itself. *)
type func = Function of proto * expr
				655	</pre>
				656	</dd>
				657
				658	<dt>parser.ml:</dt>
				659	<dd class="doc_code">
				660	<pre>
				661	(*===---------------------------------------------------------------------===
				662	* Parser
				663	===---------------------------------------------------------------------===)
				664
				665	(* binop_precedence - This holds the precedence for each binary operator that is
				666	* defined *)
				667	let binop_precedence:(char, int) Hashtbl.t = Hashtbl.create 10
				668
				669	(* precedence - Get the precedence of the pending binary operator token. *)
				670	let precedence c = try Hashtbl.find binop_precedence c with Not_found -> -1
				671
				672	(* primary
				673	* ::= identifier
				674	* ::= numberexpr
				675	* ::= parenexpr *)
				676	let rec parse_primary = parser
				677	(* numberexpr ::= number *)
				678	\| [< 'Token.Number n >] -> Ast.Number n
				679
				680	(* parenexpr ::= '(' expression ')' *)
				681	\| [< 'Token.Kwd '('; e=parse_expr; 'Token.Kwd ')' ?? "expected ')'" >] -> e
				682
				683	(* identifierexpr
				684	* ::= identifier
				685	* ::= identifier '(' argumentexpr ')' *)
				686	\| [< 'Token.Ident id; stream >] ->
				687	let rec parse_args accumulator = parser
				688	\| [< e=parse_expr; stream >] ->
				689	begin parser
				690	\| [< 'Token.Kwd ','; e=parse_args (e :: accumulator) >] -> e
				691	\| [< >] -> e :: accumulator
				692	end stream
				693	\| [< >] -> accumulator
				694	in
				695	let rec parse_ident id = parser
				696	(* Call. *)
				697	\| [< 'Token.Kwd '(';
				698	args=parse_args [];
				699	'Token.Kwd ')' ?? "expected ')'">] ->
				700	Ast.Call (id, Array.of_list (List.rev args))
				701
				702	(* Simple variable ref. *)
				703	\| [< >] -> Ast.Variable id
				704	in
				705	parse_ident id stream
				706
				707	\| [< >] -> raise (Stream.Error "unknown token when expecting an expression.")
				708
				709	(* binoprhs
				710	 *   ::= ('+' primary)* *)
				711	and parse_bin_rhs expr_prec lhs stream =
				712	  match Stream.peek stream with
				713	  (* If this is a binop, find its precedence. *)
				714	  | Some (Token.Kwd c) when Hashtbl.mem binop_precedence c ->
				715	      let token_prec = precedence c in
				716
				717	      (* If this is a binop that binds at least as tightly as the current binop,
				718	       * consume it, otherwise we are done. *)
				719	      if token_prec < expr_prec then lhs else begin
				720	        (* Eat the binop. *)
				721	        Stream.junk stream;
				722
				723	        (* Parse the primary expression after the binary operator. *)
				724	        let rhs = parse_primary stream in
				725
				726	        (* Okay, we know this is a binop. *)
				727	        let rhs =
				728	          match Stream.peek stream with
				729	          | Some (Token.Kwd c2) ->
				730	              (* If BinOp binds less tightly with rhs than the operator after
				731	               * rhs, let the pending operator take rhs as its lhs. *)
				732	              let next_prec = precedence c2 in
				733	              if token_prec < next_prec
				734	              then parse_bin_rhs (token_prec + 1) rhs stream
				735	              else rhs
				736	          | _ -> rhs
				737	        in
				738
				739	        (* Merge lhs/rhs. *)
				740	        let lhs = Ast.Binary (c, lhs, rhs) in
				741	        parse_bin_rhs expr_prec lhs stream
				742	      end
				743	  | _ -> lhs
				744
				745	(* expression
				746	 *   ::= primary binoprhs *)
				747	and parse_expr = parser
				748	  | [< lhs=parse_primary; stream >] -> parse_bin_rhs 0 lhs stream
				749
				750	(* prototype
				751	 *   ::= id '(' id* ')' *)
				752	let parse_prototype =
				753	  let rec parse_args accumulator = parser
				754	    | [< 'Token.Ident id; e=parse_args (id::accumulator) >] -> e
				755	    | [< >] -> accumulator
				756	  in
				757
				758	  parser
				759	  | [< 'Token.Ident id;
				760	       'Token.Kwd '(' ?? "expected '(' in prototype";
				761	       args=parse_args [];
				762	       'Token.Kwd ')' ?? "expected ')' in prototype" >] ->
				763	      (* success. *)
				764	      Ast.Prototype (id, Array.of_list (List.rev args))
				765
				766	  | [< >] ->
				767	      raise (Stream.Error "expected function name in prototype")
				768
				769	(* definition ::= 'def' prototype expression *)
				770	let parse_definition = parser
				771	  | [< 'Token.Def; p=parse_prototype; e=parse_expr >] ->
				772	      Ast.Function (p, e)
				773
				774	(* toplevelexpr ::= expression *)
				775	let parse_toplevel = parser
				776	  | [< e=parse_expr >] ->
				777	      (* Make an anonymous proto. *)
				778	      Ast.Function (Ast.Prototype ("", [||]), e)
				779
				780	(* external ::= 'extern' prototype *)
				781	let parse_extern = parser
				782	  | [< 'Token.Extern; e=parse_prototype >] -> e
				783	</pre>
				784	</dd>
				785
				786	<dt>codegen.ml:</dt>
				787	<dd class="doc_code">
				788	<pre>
				789	(*===----------------------------------------------------------------------===
				790	 * Code Generation
				791	 *===----------------------------------------------------------------------===*)
				792
				793	open Llvm
				794
				795	exception Error of string
				796
				797	let the_module = create_module "my cool jit"
				798	let builder = builder ()
				799	let named_values:(string, llvalue) Hashtbl.t = Hashtbl.create 10
				800
				801	let rec codegen_expr = function
				802	  | Ast.Number n -> const_float double_type n
				803	  | Ast.Variable name ->
				804	      (try Hashtbl.find named_values name with
				805	        | Not_found -> raise (Error "unknown variable name"))
				806	  | Ast.Binary (op, lhs, rhs) ->
				807	      let lhs_val = codegen_expr lhs in
				808	      let rhs_val = codegen_expr rhs in
				809	      begin
				810	        match op with
				811	        | '+' -> build_add lhs_val rhs_val "addtmp" builder
				812	        | '-' -> build_sub lhs_val rhs_val "subtmp" builder
				813	        | '*' -> build_mul lhs_val rhs_val "multmp" builder
				814	        | '<' ->
				815	            (* Convert bool 0/1 to double 0.0 or 1.0 *)
				816	            let i = build_fcmp Fcmp.Ult lhs_val rhs_val "cmptmp" builder in
				817	            build_uitofp i double_type "booltmp" builder
				818	        | _ -> raise (Error "invalid binary operator")
				819	      end
				820	  | Ast.Call (callee, args) ->
				821	      (* Look up the name in the module table. *)
				822	      let callee =
				823	        match lookup_function callee the_module with
				824	        | Some callee -> callee
				825	        | None -> raise (Error "unknown function referenced")
				826	      in
				827	      let params = params callee in
				828
				829	      (* If argument mismatch error. *)
				830	      if Array.length params == Array.length args then () else
				831	        raise (Error "incorrect # arguments passed");
				832	      let args = Array.map codegen_expr args in
				833	      build_call callee args "calltmp" builder
				834
				835	let codegen_proto = function
				836	  | Ast.Prototype (name, args) ->
				837	      (* Make the function type: double(double,double) etc. *)
				838	      let doubles = Array.make (Array.length args) double_type in
				839	      let ft = function_type double_type doubles in
				840	      let f =
				841	        match lookup_function name the_module with
				842	        | None -> declare_function name ft the_module
				843
				844	        (* If 'f' conflicted, there was already something named 'name'. If it
				845	         * has a body, don't allow redefinition or reextern. *)
				846	        | Some f ->
				847	            (* If 'f' already has a body, reject this. *)
				848	            if block_begin f <> At_end f then
				849	              raise (Error "redefinition of function");
				850
				851	            (* If 'f' took a different number of arguments, reject. *)
				852	            if element_type (type_of f) <> ft then
				853	              raise (Error "redefinition of function with different # args");
				854	            f
				855	      in
				856
				857	      (* Set names for all arguments. *)
				858	      Array.iteri (fun i a ->
				859	        let n = args.(i) in
				860	        set_value_name n a;
				861	        Hashtbl.add named_values n a;
				862	      ) (params f);
				863	      f
				864
				865	let codegen_func the_fpm = function
				866	  | Ast.Function (proto, body) ->
				867	      Hashtbl.clear named_values;
				868	      let the_function = codegen_proto proto in
				869
				870	      (* Create a new basic block to start insertion into. *)
				871	      let bb = append_block "entry" the_function in
				872	      position_at_end bb builder;
				873
				874	      try
				875	        let ret_val = codegen_expr body in
				876
				877	        (* Finish off the function. *)
				878	        let _ = build_ret ret_val builder in
				879
				880	        (* Validate the generated code, checking for consistency. *)
				881	        Llvm_analysis.assert_valid_function the_function;
				882
				883	        (* Optimize the function. *)
				884	        let _ = PassManager.run_function the_function the_fpm in
				885
				886	        the_function
				887	      with e ->
				888	        delete_function the_function;
				889	        raise e
				890	</pre>
				891	</dd>
				892
				893	<dt>toplevel.ml:</dt>
				894	<dd class="doc_code">
				895	<pre>
				896	(*===----------------------------------------------------------------------===
				897	 * Top-Level parsing and JIT Driver
				898	 *===----------------------------------------------------------------------===*)
				899
				900	open Llvm
				901	open Llvm_executionengine
				902
				903	(* top ::= definition | external | expression | ';' *)
				904	let rec main_loop the_fpm the_execution_engine stream =
				905	  match Stream.peek stream with
				906	  | None -> ()
				907
				908	  (* ignore top-level semicolons. *)
				909	  | Some (Token.Kwd ';') ->
				910	      Stream.junk stream;
				911	      main_loop the_fpm the_execution_engine stream
				912
				913	  | Some token ->
				914	      begin
				915	        try match token with
				916	        | Token.Def ->
				917	            let e = Parser.parse_definition stream in
				918	            print_endline "parsed a function definition.";
				919	            dump_value (Codegen.codegen_func the_fpm e);
				920	        | Token.Extern ->
				921	            let e = Parser.parse_extern stream in
				922	            print_endline "parsed an extern.";
				923	            dump_value (Codegen.codegen_proto e);
				924	        | _ ->
				925	            (* Evaluate a top-level expression into an anonymous function. *)
				926	            let e = Parser.parse_toplevel stream in
				927	            print_endline "parsed a top-level expr";
				928	            let the_function = Codegen.codegen_func the_fpm e in
				929	            dump_value the_function;
				930
				931	            (* JIT the function, returning a function pointer. *)
				932	            let result = ExecutionEngine.run_function the_function [||]
				933	              the_execution_engine in
				934
				935	            print_string "Evaluated to ";
				936	            print_float (GenericValue.as_float double_type result);
				937	            print_newline ();
				938	        with Stream.Error s | Codegen.Error s ->
				939	          (* Skip token for error recovery. *)
				940	          Stream.junk stream;
				941	          print_endline s;
				942	      end;
				943	      print_string "ready> "; flush stdout;
				944	      main_loop the_fpm the_execution_engine stream
				945	</pre>
				946	</dd>
				947
				948	<dt>toy.ml:</dt>
				949	<dd class="doc_code">
				950	<pre>
				951	(*===----------------------------------------------------------------------===
				952	 * Main driver code.
				953	 *===----------------------------------------------------------------------===*)
				954
				955	open Llvm
				956	open Llvm_executionengine
				957	open Llvm_target
				958	open Llvm_scalar_opts
				959
				960	let main () =
				961	  (* Install standard binary operators.
				962	   * 1 is the lowest precedence. *)
				963	  Hashtbl.add Parser.binop_precedence '<' 10;
				964	  Hashtbl.add Parser.binop_precedence '+' 20;
				965	  Hashtbl.add Parser.binop_precedence '-' 20;
				966	  Hashtbl.add Parser.binop_precedence '*' 40;    (* highest. *)
				967
				968	  (* Prime the first token. *)
				969	  print_string "ready> "; flush stdout;
				970	  let stream = Lexer.lex (Stream.of_channel stdin) in
				971
				972	  (* Create the JIT. *)
				973	  let the_module_provider = ModuleProvider.create Codegen.the_module in
				974	  let the_execution_engine = ExecutionEngine.create the_module_provider in
				975	  let the_fpm = PassManager.create_function the_module_provider in
				976
				977	  (* Set up the optimizer pipeline. Start with registering info about how the
				978	   * target lays out data structures. *)
				979	  TargetData.add (ExecutionEngine.target_data the_execution_engine) the_fpm;
				980
				981	  (* Do simple "peephole" optimizations and bit-twiddling optzn. *)
				982	  add_instruction_combining the_fpm;
				983
				984	  (* reassociate expressions. *)
				985	  add_reassociation the_fpm;
				986
				987	  (* Eliminate Common SubExpressions. *)
				988	  add_gvn the_fpm;
				989
				990	  (* Simplify the control flow graph (deleting unreachable blocks, etc). *)
				991	  add_cfg_simplification the_fpm;
				992
				993	  (* Run the main "interpreter loop" now. *)
				994	  Toplevel.main_loop the_fpm the_execution_engine stream;
				995
				996	  (* Print out all the generated code. *)
				997	  dump_module Codegen.the_module
				998	;;
				999
				1000	main ()
				1001	</pre>
				1002	</dd>
				1003
				1004	<dt>bindings.c:</dt>
				1005	<dd class="doc_code">
				1006	<pre>
				1007	#include <stdio.h>
				1008
				1009	/* putchard - putchar that takes a double and returns 0. */
				1010	extern double putchard(double X) {
				1011	  putchar((char)X);
				1012	  return 0;
				1013	}
				1014	</pre>
				1015	</dd>
				1016	</dl>
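
<p>The operator-precedence loop in <tt>parse_bin_rhs</tt> relies on camlp4 stream
parsers, so it cannot be tried without the rest of the listing. The following is
only a minimal sketch of the same idea over a plain token list. The
<tt>token</tt> and <tt>expr</tt> types, the hard-coded <tt>prec</tt> table, and
the <tt>parse</tt> helper are assumptions made for this illustration and are not
part of the tutorial's parser; primary expressions are limited to numbers.</p>

<div class="doc_code">
<pre>
(* Hypothetical, simplified token and AST types for this sketch only. *)
type token = Num of float | Op of char

type expr =
  | Number of float
  | Binary of char * expr * expr

(* Same precedences that toy.ml installs; -1 means "not a binary operator". *)
let prec = function
  | '<' -> 10
  | '+' | '-' -> 20
  | '*' -> 40
  | _ -> -1

(* primary ::= number   (parentheses, variables and calls are omitted) *)
let parse_primary = function
  | Num n :: rest -> (Number n, rest)
  | _ -> failwith "expected a number"

(* binoprhs ::= (binop primary)*
 * Returns the expression built so far and the tokens not yet consumed. *)
let rec parse_bin_rhs expr_prec lhs tokens =
  match tokens with
  | Op c :: rest when prec c >= expr_prec ->
      let token_prec = prec c in
      let rhs, rest = parse_primary rest in
      (* If the operator after rhs binds more tightly than c, let it take
       * rhs as its left-hand side first. *)
      let rhs, rest =
        match rest with
        | Op c2 :: _ when token_prec < prec c2 ->
            parse_bin_rhs (token_prec + 1) rhs rest
        | _ -> (rhs, rest)
      in
      (* Merge lhs/rhs and keep consuming operators at this level. *)
      parse_bin_rhs expr_prec (Binary (c, lhs, rhs)) rest
  | _ -> (lhs, tokens)

(* expression ::= primary binoprhs *)
let parse tokens =
  let lhs, rest = parse_primary tokens in
  match parse_bin_rhs 0 lhs rest with
  | e, [] -> e
  | _ -> failwith "unexpected trailing tokens"
</pre>
</div>

<p>With these definitions, <tt>parse [Num 1.; Op '+'; Num 2.; Op '*'; Num 3.]</tt>
evaluates to <tt>Binary ('+', Number 1., Binary ('*', Number 2., Number 3.))</tt>,
because <tt>'*'</tt> binds more tightly than <tt>'+'</tt>. Operators of equal
precedence group to the left.</p>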
				1017
				1018	<a href="OCamlLangImpl5.html">Next: Extending the language: control flow</a>
				1019	</div>
				1020
				1021	<!-- *********************************************************************** -->
				1022	<hr>
				1023	<address>
				1024	<a href="http://jigsaw.w3.org/css-validator/check/referer"><img
				1025	src="http://jigsaw.w3.org/css-validator/images/vcss" alt="Valid CSS!"></a>
				1026	<a href="http://validator.w3.org/check/referer"><img
				1027	src="http://www.w3.org/Icons/valid-html401" alt="Valid HTML 4.01!"></a>
				1028
				1029	<a href="mailto:sabre@nondot.org">Chris Lattner</a><br>
				1030	<a href="mailto:idadesub@users.sourceforge.net">Erick Tryzelaar</a><br>
				1031	<a href="http://llvm.org">The LLVM Compiler Infrastructure</a><br>
				1032	Last modified: $Date: 2007-10-17 11:05:13 -0700 (Wed, 17 Oct 2007) $
				1033	</address>
				1034	</body>
				1035	</html>