Blame - docs/tutorial/OCamlLangImpl4.rst - fp2-dev/platform/external/llvm

blob: 865a03dfb781021e5db882d62110500e40f7b47a [file] [log] [blame]

Sean Silva	ee47edf	2012-12-05 00:26:32 +0000	[diff] [blame]	1	==============================================
				2	Kaleidoscope: Adding JIT and Optimizer Support
				3	==============================================
				4
				5	.. contents::
				6	:local:
				7
				8	Written by `Chris Lattner <mailto:sabre@nondot.org>`_ and `Erick
				9	Tryzelaar <mailto:idadesub@users.sourceforge.net>`_
				10
				11	Chapter 4 Introduction
				12	======================
				13
				14	Welcome to Chapter 4 of the "`Implementing a language with
				15	LLVM <index.html>`_" tutorial. Chapters 1-3 described the implementation
				16	of a simple language and added support for generating LLVM IR. This
				17	chapter describes two new techniques: adding optimizer support to your
				18	language, and adding JIT compiler support. These additions will
				19	demonstrate how to get nice, efficient code for the Kaleidoscope
				20	language.
				21
				22	Trivial Constant Folding
				23	========================
				24
				25	Note: the default ``IRBuilder`` now always includes the constant
				26	folding optimisations below.
				27
				28	Our demonstration for Chapter 3 is elegant and easy to extend.
				29	Unfortunately, it does not produce wonderful code. For example, when
				30	compiling simple code, we don't get obvious optimizations:
				31
				32	::
				33
				34	ready> def test(x) 1+2+x;
				35	Read function definition:
				36	define double @test(double %x) {
				37	entry:
				38	%addtmp = fadd double 1.000000e+00, 2.000000e+00
				39	%addtmp1 = fadd double %addtmp, %x
				40	ret double %addtmp1
				41	}
				42
				43	This code is a very, very literal transcription of the AST built by
				44	parsing the input. As such, this transcription lacks optimizations like
				45	constant folding (we'd like to get "``add x, 3.0``" in the example
				46	above) as well as other more important optimizations. Constant folding,
				47	in particular, is a very common and very important optimization: so much
				48	so that many language implementors implement constant folding support in
				49	their AST representation.
				50
				51	With LLVM, you don't need this support in the AST. Since all calls to
				52	build LLVM IR go through the LLVM builder, it would be nice if the
				53	builder itself checked to see if there was a constant folding
				54	opportunity when you call it. If so, it could just do the constant fold
				55	and return the constant instead of creating an instruction. This is
				56	exactly what the ``LLVMFoldingBuilder`` class does.
				57
				58	All we did was switch from ``LLVMBuilder`` to ``LLVMFoldingBuilder``.
				59	Though we change no other code, we now have all of our instructions
				60	implicitly constant folded without us having to do anything about it.
				61	For example, the input above now compiles to:
				62
				63	::
				64
				65	ready> def test(x) 1+2+x;
				66	Read function definition:
				67	define double @test(double %x) {
				68	entry:
				69	%addtmp = fadd double 3.000000e+00, %x
				70	ret double %addtmp
				71	}
				72
				73	Well, that was easy :). In practice, we recommend always using
				74	``LLVMFoldingBuilder`` when generating code like this. It has no
				75	"syntactic overhead" for its use (you don't have to uglify your compiler
				76	with constant checks everywhere) and it can dramatically reduce the
				77	amount of LLVM IR that is generated in some cases (particular for
				78	languages with a macro preprocessor or that use a lot of constants).
				79
				80	On the other hand, the ``LLVMFoldingBuilder`` is limited by the fact
				81	that it does all of its analysis inline with the code as it is built. If
				82	you take a slightly more complex example:
				83
				84	::
				85
				86	ready> def test(x) (1+2+x)*(x+(1+2));
				87	ready> Read function definition:
				88	define double @test(double %x) {
				89	entry:
				90	%addtmp = fadd double 3.000000e+00, %x
				91	%addtmp1 = fadd double %x, 3.000000e+00
				92	%multmp = fmul double %addtmp, %addtmp1
				93	ret double %multmp
				94	}
				95
				96	In this case, the LHS and RHS of the multiplication are the same value.
				97	We'd really like to see this generate "``tmp = x+3; result = tmp*tmp;``"
				98	instead of computing "``x*3``" twice.
				99
				100	Unfortunately, no amount of local analysis will be able to detect and
				101	correct this. This requires two transformations: reassociation of
				102	expressions (to make the add's lexically identical) and Common
				103	Subexpression Elimination (CSE) to delete the redundant add instruction.
				104	Fortunately, LLVM provides a broad range of optimizations that you can
				105	use, in the form of "passes".
				106
				107	LLVM Optimization Passes
				108	========================
				109
				110	LLVM provides many optimization passes, which do many different sorts of
				111	things and have different tradeoffs. Unlike other systems, LLVM doesn't
				112	hold to the mistaken notion that one set of optimizations is right for
				113	all languages and for all situations. LLVM allows a compiler implementor
				114	to make complete decisions about what optimizations to use, in which
				115	order, and in what situation.
				116
				117	As a concrete example, LLVM supports both "whole module" passes, which
				118	look across as large of body of code as they can (often a whole file,
				119	but if run at link time, this can be a substantial portion of the whole
				120	program). It also supports and includes "per-function" passes which just
				121	operate on a single function at a time, without looking at other
				122	functions. For more information on passes and how they are run, see the
				123	`How to Write a Pass <../WritingAnLLVMPass.html>`_ document and the
				124	`List of LLVM Passes <../Passes.html>`_.
				125
				126	For Kaleidoscope, we are currently generating functions on the fly, one
				127	at a time, as the user types them in. We aren't shooting for the
				128	ultimate optimization experience in this setting, but we also want to
				129	catch the easy and quick stuff where possible. As such, we will choose
				130	to run a few per-function optimizations as the user types the function
				131	in. If we wanted to make a "static Kaleidoscope compiler", we would use
				132	exactly the code we have now, except that we would defer running the
				133	optimizer until the entire file has been parsed.
				134
				135	In order to get per-function optimizations going, we need to set up a
				136	`Llvm.PassManager <../WritingAnLLVMPass.html#passmanager>`_ to hold and
				137	organize the LLVM optimizations that we want to run. Once we have that,
				138	we can add a set of optimizations to run. The code looks like this:
				139
				140	.. code-block:: ocaml
				141
				142	(* Create the JIT. *)
				143	let the_execution_engine = ExecutionEngine.create Codegen.the_module in
				144	let the_fpm = PassManager.create_function Codegen.the_module in
				145
				146	(* Set up the optimizer pipeline. Start with registering info about how the
				147	* target lays out data structures. *)
				148	DataLayout.add (ExecutionEngine.target_data the_execution_engine) the_fpm;
				149
				150	(* Do simple "peephole" optimizations and bit-twiddling optzn. *)
				151	add_instruction_combining the_fpm;
				152
				153	(* reassociate expressions. *)
				154	add_reassociation the_fpm;
				155
				156	(* Eliminate Common SubExpressions. *)
				157	add_gvn the_fpm;
				158
				159	(* Simplify the control flow graph (deleting unreachable blocks, etc). *)
				160	add_cfg_simplification the_fpm;
				161
				162	ignore (PassManager.initialize the_fpm);
				163
				164	(* Run the main "interpreter loop" now. *)
				165	Toplevel.main_loop the_fpm the_execution_engine stream;
				166
				167	The meat of the matter here, is the definition of "``the_fpm``". It
				168	requires a pointer to the ``the_module`` to construct itself. Once it is
				169	set up, we use a series of "add" calls to add a bunch of LLVM passes.
				170	The first pass is basically boilerplate, it adds a pass so that later
				171	optimizations know how the data structures in the program are laid out.
				172	The "``the_execution_engine``" variable is related to the JIT, which we
				173	will get to in the next section.
				174
				175	In this case, we choose to add 4 optimization passes. The passes we
				176	chose here are a pretty standard set of "cleanup" optimizations that are
				177	useful for a wide variety of code. I won't delve into what they do but,
				178	believe me, they are a good starting place :).
				179
				180	Once the ``Llvm.PassManager.`` is set up, we need to make use of it. We
				181	do this by running it after our newly created function is constructed
				182	(in ``Codegen.codegen_func``), but before it is returned to the client:
				183
				184	.. code-block:: ocaml
				185
				186	let codegen_func the_fpm = function
				187	...
				188	try
				189	let ret_val = codegen_expr body in
				190
				191	(* Finish off the function. *)
				192	let _ = build_ret ret_val builder in
				193
				194	(* Validate the generated code, checking for consistency. *)
				195	Llvm_analysis.assert_valid_function the_function;
				196
				197	(* Optimize the function. *)
				198	let _ = PassManager.run_function the_function the_fpm in
				199
				200	the_function
				201
				202	As you can see, this is pretty straightforward. The ``the_fpm``
				203	optimizes and updates the LLVM Function\* in place, improving
				204	(hopefully) its body. With this in place, we can try our test above
				205	again:
				206
				207	::
				208
				209	ready> def test(x) (1+2+x)*(x+(1+2));
				210	ready> Read function definition:
				211	define double @test(double %x) {
				212	entry:
				213	%addtmp = fadd double %x, 3.000000e+00
				214	%multmp = fmul double %addtmp, %addtmp
				215	ret double %multmp
				216	}
				217
				218	As expected, we now get our nicely optimized code, saving a floating
				219	point add instruction from every execution of this function.
				220
				221	LLVM provides a wide variety of optimizations that can be used in
				222	certain circumstances. Some `documentation about the various
				223	passes <../Passes.html>`_ is available, but it isn't very complete.
				224	Another good source of ideas can come from looking at the passes that
				225	``Clang`` runs to get started. The "``opt``" tool allows you to
				226	experiment with passes from the command line, so you can see if they do
				227	anything.
				228
				229	Now that we have reasonable code coming out of our front-end, lets talk
				230	about executing it!
				231
				232	Adding a JIT Compiler
				233	=====================
				234
				235	Code that is available in LLVM IR can have a wide variety of tools
				236	applied to it. For example, you can run optimizations on it (as we did
				237	above), you can dump it out in textual or binary forms, you can compile
				238	the code to an assembly file (.s) for some target, or you can JIT
				239	compile it. The nice thing about the LLVM IR representation is that it
				240	is the "common currency" between many different parts of the compiler.
				241
				242	In this section, we'll add JIT compiler support to our interpreter. The
				243	basic idea that we want for Kaleidoscope is to have the user enter
				244	function bodies as they do now, but immediately evaluate the top-level
				245	expressions they type in. For example, if they type in "1 + 2;", we
				246	should evaluate and print out 3. If they define a function, they should
				247	be able to call it from the command line.
				248
				249	In order to do this, we first declare and initialize the JIT. This is
				250	done by adding a global variable and a call in ``main``:
				251
				252	.. code-block:: ocaml
				253
				254	...
				255	let main () =
				256	...
				257	(* Create the JIT. *)
				258	let the_execution_engine = ExecutionEngine.create Codegen.the_module in
				259	...
				260
				261	This creates an abstract "Execution Engine" which can be either a JIT
				262	compiler or the LLVM interpreter. LLVM will automatically pick a JIT
				263	compiler for you if one is available for your platform, otherwise it
				264	will fall back to the interpreter.
				265
				266	Once the ``Llvm_executionengine.ExecutionEngine.t`` is created, the JIT
				267	is ready to be used. There are a variety of APIs that are useful, but
				268	the simplest one is the
				269	"``Llvm_executionengine.ExecutionEngine.run_function``" function. This
				270	method JIT compiles the specified LLVM Function and returns a function
				271	pointer to the generated machine code. In our case, this means that we
				272	can change the code that parses a top-level expression to look like
				273	this:
				274
				275	.. code-block:: ocaml
				276
				277	(* Evaluate a top-level expression into an anonymous function. *)
				278	let e = Parser.parse_toplevel stream in
				279	print_endline "parsed a top-level expr";
				280	let the_function = Codegen.codegen_func the_fpm e in
				281	dump_value the_function;
				282
				283	(* JIT the function, returning a function pointer. *)
				284	let result = ExecutionEngine.run_function the_function [\|\|]
				285	the_execution_engine in
				286
				287	print_string "Evaluated to ";
				288	print_float (GenericValue.as_float Codegen.double_type result);
				289	print_newline ();
				290
				291	Recall that we compile top-level expressions into a self-contained LLVM
				292	function that takes no arguments and returns the computed double.
				293	Because the LLVM JIT compiler matches the native platform ABI, this
				294	means that you can just cast the result pointer to a function pointer of
				295	that type and call it directly. This means, there is no difference
				296	between JIT compiled code and native machine code that is statically
				297	linked into your application.
				298
				299	With just these two changes, lets see how Kaleidoscope works now!
				300
				301	::
				302
				303	ready> 4+5;
				304	define double @""() {
				305	entry:
				306	ret double 9.000000e+00
				307	}
				308
				309	Evaluated to 9.000000
				310
				311	Well this looks like it is basically working. The dump of the function
				312	shows the "no argument function that always returns double" that we
				313	synthesize for each top level expression that is typed in. This
				314	demonstrates very basic functionality, but can we do more?
				315
				316	::
				317
				318	ready> def testfunc(x y) x + y*2;
				319	Read function definition:
				320	define double @testfunc(double %x, double %y) {
				321	entry:
				322	%multmp = fmul double %y, 2.000000e+00
				323	%addtmp = fadd double %multmp, %x
				324	ret double %addtmp
				325	}
				326
				327	ready> testfunc(4, 10);
				328	define double @""() {
				329	entry:
				330	%calltmp = call double @testfunc(double 4.000000e+00, double 1.000000e+01)
				331	ret double %calltmp
				332	}
				333
				334	Evaluated to 24.000000
				335
				336	This illustrates that we can now call user code, but there is something
				337	a bit subtle going on here. Note that we only invoke the JIT on the
				338	anonymous functions that call testfunc, but we never invoked it on
				339	testfunc itself. What actually happened here is that the JIT scanned
				340	for all non-JIT'd functions transitively called from the anonymous
				341	function and compiled all of them before returning from
				342	``run_function``.
				343
				344	The JIT provides a number of other more advanced interfaces for things
				345	like freeing allocated machine code, rejit'ing functions to update them,
				346	etc. However, even with this simple code, we get some surprisingly
				347	powerful capabilities - check this out (I removed the dump of the
				348	anonymous functions, you should get the idea by now :) :
				349
				350	::
				351
				352	ready> extern sin(x);
				353	Read extern:
				354	declare double @sin(double)
				355
				356	ready> extern cos(x);
				357	Read extern:
				358	declare double @cos(double)
				359
				360	ready> sin(1.0);
				361	Evaluated to 0.841471
				362
				363	ready> def foo(x) sin(x)sin(x) + cos(x)cos(x);
				364	Read function definition:
				365	define double @foo(double %x) {
				366	entry:
				367	%calltmp = call double @sin(double %x)
				368	%multmp = fmul double %calltmp, %calltmp
				369	%calltmp2 = call double @cos(double %x)
				370	%multmp4 = fmul double %calltmp2, %calltmp2
				371	%addtmp = fadd double %multmp, %multmp4
				372	ret double %addtmp
				373	}
				374
				375	ready> foo(4.0);
				376	Evaluated to 1.000000
				377
				378	Whoa, how does the JIT know about sin and cos? The answer is
				379	surprisingly simple: in this example, the JIT started execution of a
				380	function and got to a function call. It realized that the function was
				381	not yet JIT compiled and invoked the standard set of routines to resolve
				382	the function. In this case, there is no body defined for the function,
				383	so the JIT ended up calling "``dlsym("sin")``" on the Kaleidoscope
				384	process itself. Since "``sin``" is defined within the JIT's address
				385	space, it simply patches up calls in the module to call the libm version
				386	of ``sin`` directly.
				387
				388	The LLVM JIT provides a number of interfaces (look in the
				389	``llvm_executionengine.mli`` file) for controlling how unknown functions
				390	get resolved. It allows you to establish explicit mappings between IR
				391	objects and addresses (useful for LLVM global variables that you want to
				392	map to static tables, for example), allows you to dynamically decide on
				393	the fly based on the function name, and even allows you to have the JIT
				394	compile functions lazily the first time they're called.
				395
				396	One interesting application of this is that we can now extend the
				397	language by writing arbitrary C code to implement operations. For
				398	example, if we add:
				399
				400	.. code-block:: c++
				401
				402	/* putchard - putchar that takes a double and returns 0. */
				403	extern "C"
				404	double putchard(double X) {
				405	putchar((char)X);
				406	return 0;
				407	}
				408
				409	Now we can produce simple output to the console by using things like:
				410	"``extern putchard(x); putchard(120);``", which prints a lowercase 'x'
				411	on the console (120 is the ASCII code for 'x'). Similar code could be
				412	used to implement file I/O, console input, and many other capabilities
				413	in Kaleidoscope.
				414
				415	This completes the JIT and optimizer chapter of the Kaleidoscope
				416	tutorial. At this point, we can compile a non-Turing-complete
				417	programming language, optimize and JIT compile it in a user-driven way.
				418	Next up we'll look into `extending the language with control flow
				419	constructs <OCamlLangImpl5.html>`_, tackling some interesting LLVM IR
				420	issues along the way.
				421
				422	Full Code Listing
				423	=================
				424
				425	Here is the complete code listing for our running example, enhanced with
				426	the LLVM JIT and optimizer. To build this example, use:
				427
				428	.. code-block:: bash
				429
				430	# Compile
				431	ocamlbuild toy.byte
				432	# Run
				433	./toy.byte
				434
				435	Here is the code:
				436
				437	\_tags:
				438	::
				439
				440	<{lexer,parser}.ml>: use_camlp4, pp(camlp4of)
				441	<*.{byte,native}>: g++, use_llvm, use_llvm_analysis
				442	<*.{byte,native}>: use_llvm_executionengine, use_llvm_target
				443	<*.{byte,native}>: use_llvm_scalar_opts, use_bindings
				444
				445	myocamlbuild.ml:
				446	.. code-block:: ocaml
				447
				448	open Ocamlbuild_plugin;;
				449
				450	ocaml_lib ~extern:true "llvm";;
				451	ocaml_lib ~extern:true "llvm_analysis";;
				452	ocaml_lib ~extern:true "llvm_executionengine";;
				453	ocaml_lib ~extern:true "llvm_target";;
				454	ocaml_lib ~extern:true "llvm_scalar_opts";;
				455
				456	flag ["link"; "ocaml"; "g++"] (S[A"-cc"; A"g++"]);;
				457	dep ["link"; "ocaml"; "use_bindings"] ["bindings.o"];;
				458
				459	token.ml:
				460	.. code-block:: ocaml
				461
				462	(*===----------------------------------------------------------------------===
				463	* Lexer Tokens
				464	===----------------------------------------------------------------------===)
				465
				466	(* The lexer returns these 'Kwd' if it is an unknown character, otherwise one of
				467	* these others for known things. *)
				468	type token =
				469	(* commands *)
				470	\| Def \| Extern
				471
				472	(* primary *)
				473	\| Ident of string \| Number of float
				474
				475	(* unknown *)
				476	\| Kwd of char
				477
				478	lexer.ml:
				479	.. code-block:: ocaml
				480
				481	(*===----------------------------------------------------------------------===
				482	* Lexer
				483	===----------------------------------------------------------------------===)
				484
				485	let rec lex = parser
				486	(* Skip any whitespace. *)
				487	\| [< ' (' ' \| '\n' \| '\r' \| '\t'); stream >] -> lex stream
				488
				489	(* identifier: [a-zA-Z][a-zA-Z0-9] *)
				490	\| [< ' ('A' .. 'Z' \| 'a' .. 'z' as c); stream >] ->
				491	let buffer = Buffer.create 1 in
				492	Buffer.add_char buffer c;
				493	lex_ident buffer stream
				494
				495	(* number: [0-9.]+ *)
				496	\| [< ' ('0' .. '9' as c); stream >] ->
				497	let buffer = Buffer.create 1 in
				498	Buffer.add_char buffer c;
				499	lex_number buffer stream
				500
				501	(* Comment until end of line. *)
				502	\| [< ' ('#'); stream >] ->
				503	lex_comment stream
				504
				505	(* Otherwise, just return the character as its ascii value. *)
				506	\| [< 'c; stream >] ->
				507	[< 'Token.Kwd c; lex stream >]
				508
				509	(* end of stream. *)
				510	\| [< >] -> [< >]
				511
				512	and lex_number buffer = parser
				513	\| [< ' ('0' .. '9' \| '.' as c); stream >] ->
				514	Buffer.add_char buffer c;
				515	lex_number buffer stream
				516	\| [< stream=lex >] ->
				517	[< 'Token.Number (float_of_string (Buffer.contents buffer)); stream >]
				518
				519	and lex_ident buffer = parser
				520	\| [< ' ('A' .. 'Z' \| 'a' .. 'z' \| '0' .. '9' as c); stream >] ->
				521	Buffer.add_char buffer c;
				522	lex_ident buffer stream
				523	\| [< stream=lex >] ->
				524	match Buffer.contents buffer with
				525	\| "def" -> [< 'Token.Def; stream >]
				526	\| "extern" -> [< 'Token.Extern; stream >]
				527	\| id -> [< 'Token.Ident id; stream >]
				528
				529	and lex_comment = parser
				530	\| [< ' ('\n'); stream=lex >] -> stream
				531	\| [< 'c; e=lex_comment >] -> e
				532	\| [< >] -> [< >]
				533
				534	ast.ml:
				535	.. code-block:: ocaml
				536
				537	(*===----------------------------------------------------------------------===
				538	* Abstract Syntax Tree (aka Parse Tree)
				539	===----------------------------------------------------------------------===)
				540
				541	(* expr - Base type for all expression nodes. *)
				542	type expr =
				543	(* variant for numeric literals like "1.0". *)
				544	\| Number of float
				545
				546	(* variant for referencing a variable, like "a". *)
				547	\| Variable of string
				548
				549	(* variant for a binary operator. *)
				550	\| Binary of char * expr * expr
				551
				552	(* variant for function calls. *)
				553	\| Call of string * expr array
				554
				555	(* proto - This type represents the "prototype" for a function, which captures
				556	* its name, and its argument names (thus implicitly the number of arguments the
				557	* function takes). *)
				558	type proto = Prototype of string * string array
				559
				560	(* func - This type represents a function definition itself. *)
				561	type func = Function of proto * expr
				562
				563	parser.ml:
				564	.. code-block:: ocaml
				565
				566	(*===---------------------------------------------------------------------===
				567	* Parser
				568	===---------------------------------------------------------------------===)
				569
				570	(* binop_precedence - This holds the precedence for each binary operator that is
				571	* defined *)
				572	let binop_precedence:(char, int) Hashtbl.t = Hashtbl.create 10
				573
				574	(* precedence - Get the precedence of the pending binary operator token. *)
				575	let precedence c = try Hashtbl.find binop_precedence c with Not_found -> -1
				576
				577	(* primary
				578	* ::= identifier
				579	* ::= numberexpr
				580	* ::= parenexpr *)
				581	let rec parse_primary = parser
				582	(* numberexpr ::= number *)
				583	\| [< 'Token.Number n >] -> Ast.Number n
				584
				585	(* parenexpr ::= '(' expression ')' *)
				586	\| [< 'Token.Kwd '('; e=parse_expr; 'Token.Kwd ')' ?? "expected ')'" >] -> e
				587
				588	(* identifierexpr
				589	* ::= identifier
				590	* ::= identifier '(' argumentexpr ')' *)
				591	\| [< 'Token.Ident id; stream >] ->
				592	let rec parse_args accumulator = parser
				593	\| [< e=parse_expr; stream >] ->
				594	begin parser
				595	\| [< 'Token.Kwd ','; e=parse_args (e :: accumulator) >] -> e
				596	\| [< >] -> e :: accumulator
				597	end stream
				598	\| [< >] -> accumulator
				599	in
				600	let rec parse_ident id = parser
				601	(* Call. *)
				602	\| [< 'Token.Kwd '(';
				603	args=parse_args [];
				604	'Token.Kwd ')' ?? "expected ')'">] ->
				605	Ast.Call (id, Array.of_list (List.rev args))
				606
				607	(* Simple variable ref. *)
				608	\| [< >] -> Ast.Variable id
				609	in
				610	parse_ident id stream
				611
				612	\| [< >] -> raise (Stream.Error "unknown token when expecting an expression.")
				613
				614	(* binoprhs
				615	* ::= ('+' primary)* *)
				616	and parse_bin_rhs expr_prec lhs stream =
				617	match Stream.peek stream with
				618	(* If this is a binop, find its precedence. *)
				619	\| Some (Token.Kwd c) when Hashtbl.mem binop_precedence c ->
				620	let token_prec = precedence c in
				621
				622	(* If this is a binop that binds at least as tightly as the current binop,
				623	* consume it, otherwise we are done. *)
				624	if token_prec < expr_prec then lhs else begin
				625	(* Eat the binop. *)
				626	Stream.junk stream;
				627
				628	(* Parse the primary expression after the binary operator. *)
				629	let rhs = parse_primary stream in
				630
				631	(* Okay, we know this is a binop. *)
				632	let rhs =
				633	match Stream.peek stream with
				634	\| Some (Token.Kwd c2) ->
				635	(* If BinOp binds less tightly with rhs than the operator after
				636	* rhs, let the pending operator take rhs as its lhs. *)
				637	let next_prec = precedence c2 in
				638	if token_prec < next_prec
				639	then parse_bin_rhs (token_prec + 1) rhs stream
				640	else rhs
				641	\| _ -> rhs
				642	in
				643
				644	(* Merge lhs/rhs. *)
				645	let lhs = Ast.Binary (c, lhs, rhs) in
				646	parse_bin_rhs expr_prec lhs stream
				647	end
				648	\| _ -> lhs
				649
				650	(* expression
				651	* ::= primary binoprhs *)
				652	and parse_expr = parser
				653	\| [< lhs=parse_primary; stream >] -> parse_bin_rhs 0 lhs stream
				654
				655	(* prototype
				656	* ::= id '(' id* ')' *)
				657	let parse_prototype =
				658	let rec parse_args accumulator = parser
				659	\| [< 'Token.Ident id; e=parse_args (id::accumulator) >] -> e
				660	\| [< >] -> accumulator
				661	in
				662
				663	parser
				664	\| [< 'Token.Ident id;
				665	'Token.Kwd '(' ?? "expected '(' in prototype";
				666	args=parse_args [];
				667	'Token.Kwd ')' ?? "expected ')' in prototype" >] ->
				668	(* success. *)
				669	Ast.Prototype (id, Array.of_list (List.rev args))
				670
				671	\| [< >] ->
				672	raise (Stream.Error "expected function name in prototype")
				673
				674	(* definition ::= 'def' prototype expression *)
				675	let parse_definition = parser
				676	\| [< 'Token.Def; p=parse_prototype; e=parse_expr >] ->
				677	Ast.Function (p, e)
				678
				679	(* toplevelexpr ::= expression *)
				680	let parse_toplevel = parser
				681	\| [< e=parse_expr >] ->
				682	(* Make an anonymous proto. *)
				683	Ast.Function (Ast.Prototype ("", [\|\|]), e)
				684
				685	(* external ::= 'extern' prototype *)
				686	let parse_extern = parser
				687	\| [< 'Token.Extern; e=parse_prototype >] -> e
				688
				689	codegen.ml:
				690	.. code-block:: ocaml
				691
				692	(*===----------------------------------------------------------------------===
				693	* Code Generation
				694	===----------------------------------------------------------------------===)
				695
				696	open Llvm
				697
				698	exception Error of string
				699
				700	let context = global_context ()
				701	let the_module = create_module context "my cool jit"
				702	let builder = builder context
				703	let named_values:(string, llvalue) Hashtbl.t = Hashtbl.create 10
				704	let double_type = double_type context
				705
				706	let rec codegen_expr = function
				707	\| Ast.Number n -> const_float double_type n
				708	\| Ast.Variable name ->
				709	(try Hashtbl.find named_values name with
				710	\| Not_found -> raise (Error "unknown variable name"))
				711	\| Ast.Binary (op, lhs, rhs) ->
				712	let lhs_val = codegen_expr lhs in
				713	let rhs_val = codegen_expr rhs in
				714	begin
				715	match op with
				716	\| '+' -> build_add lhs_val rhs_val "addtmp" builder
				717	\| '-' -> build_sub lhs_val rhs_val "subtmp" builder
				718	\| '*' -> build_mul lhs_val rhs_val "multmp" builder
				719	\| '<' ->
				720	(* Convert bool 0/1 to double 0.0 or 1.0 *)
				721	let i = build_fcmp Fcmp.Ult lhs_val rhs_val "cmptmp" builder in
				722	build_uitofp i double_type "booltmp" builder
				723	\| _ -> raise (Error "invalid binary operator")
				724	end
				725	\| Ast.Call (callee, args) ->
				726	(* Look up the name in the module table. *)
				727	let callee =
				728	match lookup_function callee the_module with
				729	\| Some callee -> callee
				730	\| None -> raise (Error "unknown function referenced")
				731	in
				732	let params = params callee in
				733
				734	(* If argument mismatch error. *)
				735	if Array.length params == Array.length args then () else
				736	raise (Error "incorrect # arguments passed");
				737	let args = Array.map codegen_expr args in
				738	build_call callee args "calltmp" builder
				739
				740	let codegen_proto = function
				741	\| Ast.Prototype (name, args) ->
				742	(* Make the function type: double(double,double) etc. *)
				743	let doubles = Array.make (Array.length args) double_type in
				744	let ft = function_type double_type doubles in
				745	let f =
				746	match lookup_function name the_module with
				747	\| None -> declare_function name ft the_module
				748
				749	(* If 'f' conflicted, there was already something named 'name'. If it
				750	* has a body, don't allow redefinition or reextern. *)
				751	\| Some f ->
				752	(* If 'f' already has a body, reject this. *)
				753	if block_begin f <> At_end f then
				754	raise (Error "redefinition of function");
				755
				756	(* If 'f' took a different number of arguments, reject. *)
				757	if element_type (type_of f) <> ft then
				758	raise (Error "redefinition of function with different # args");
				759	f
				760	in
				761
				762	(* Set names for all arguments. *)
				763	Array.iteri (fun i a ->
				764	let n = args.(i) in
				765	set_value_name n a;
				766	Hashtbl.add named_values n a;
				767	) (params f);
				768	f
				769
				770	let codegen_func the_fpm = function
				771	\| Ast.Function (proto, body) ->
				772	Hashtbl.clear named_values;
				773	let the_function = codegen_proto proto in
				774
				775	(* Create a new basic block to start insertion into. *)
				776	let bb = append_block context "entry" the_function in
				777	position_at_end bb builder;
				778
				779	try
				780	let ret_val = codegen_expr body in
				781
				782	(* Finish off the function. *)
				783	let _ = build_ret ret_val builder in
				784
				785	(* Validate the generated code, checking for consistency. *)
				786	Llvm_analysis.assert_valid_function the_function;
				787
				788	(* Optimize the function. *)
				789	let _ = PassManager.run_function the_function the_fpm in
				790
				791	the_function
				792	with e ->
				793	delete_function the_function;
				794	raise e
				795
				796	toplevel.ml:
				797	.. code-block:: ocaml
				798
				799	(*===----------------------------------------------------------------------===
				800	* Top-Level parsing and JIT Driver
				801	===----------------------------------------------------------------------===)
				802
				803	open Llvm
				804	open Llvm_executionengine
				805
				806	(* top ::= definition \| external \| expression \| ';' *)
				807	let rec main_loop the_fpm the_execution_engine stream =
				808	match Stream.peek stream with
				809	\| None -> ()
				810
				811	(* ignore top-level semicolons. *)
				812	\| Some (Token.Kwd ';') ->
				813	Stream.junk stream;
				814	main_loop the_fpm the_execution_engine stream
				815
				816	\| Some token ->
				817	begin
				818	try match token with
				819	\| Token.Def ->
				820	let e = Parser.parse_definition stream in
				821	print_endline "parsed a function definition.";
				822	dump_value (Codegen.codegen_func the_fpm e);
				823	\| Token.Extern ->
				824	let e = Parser.parse_extern stream in
				825	print_endline "parsed an extern.";
				826	dump_value (Codegen.codegen_proto e);
				827	\| _ ->
				828	(* Evaluate a top-level expression into an anonymous function. *)
				829	let e = Parser.parse_toplevel stream in
				830	print_endline "parsed a top-level expr";
				831	let the_function = Codegen.codegen_func the_fpm e in
				832	dump_value the_function;
				833
				834	(* JIT the function, returning a function pointer. *)
				835	let result = ExecutionEngine.run_function the_function [\|\|]
				836	the_execution_engine in
				837
				838	print_string "Evaluated to ";
				839	print_float (GenericValue.as_float Codegen.double_type result);
				840	print_newline ();
				841	with Stream.Error s \| Codegen.Error s ->
				842	(* Skip token for error recovery. *)
				843	Stream.junk stream;
				844	print_endline s;
				845	end;
				846	print_string "ready> "; flush stdout;
				847	main_loop the_fpm the_execution_engine stream
				848
				849	toy.ml:
				850	.. code-block:: ocaml
				851
				852	(*===----------------------------------------------------------------------===
				853	* Main driver code.
				854	===----------------------------------------------------------------------===)
				855
				856	open Llvm
				857	open Llvm_executionengine
				858	open Llvm_target
				859	open Llvm_scalar_opts
				860
				861	let main () =
				862	ignore (initialize_native_target ());
				863
				864	(* Install standard binary operators.
				865	* 1 is the lowest precedence. *)
				866	Hashtbl.add Parser.binop_precedence '<' 10;
				867	Hashtbl.add Parser.binop_precedence '+' 20;
				868	Hashtbl.add Parser.binop_precedence '-' 20;
				869	Hashtbl.add Parser.binop_precedence '' 40; ( highest. *)
				870
				871	(* Prime the first token. *)
				872	print_string "ready> "; flush stdout;
				873	let stream = Lexer.lex (Stream.of_channel stdin) in
				874
				875	(* Create the JIT. *)
				876	let the_execution_engine = ExecutionEngine.create Codegen.the_module in
				877	let the_fpm = PassManager.create_function Codegen.the_module in
				878
				879	(* Set up the optimizer pipeline. Start with registering info about how the
				880	* target lays out data structures. *)
				881	DataLayout.add (ExecutionEngine.target_data the_execution_engine) the_fpm;
				882
				883	(* Do simple "peephole" optimizations and bit-twiddling optzn. *)
				884	add_instruction_combination the_fpm;
				885
				886	(* reassociate expressions. *)
				887	add_reassociation the_fpm;
				888
				889	(* Eliminate Common SubExpressions. *)
				890	add_gvn the_fpm;
				891
				892	(* Simplify the control flow graph (deleting unreachable blocks, etc). *)
				893	add_cfg_simplification the_fpm;
				894
				895	ignore (PassManager.initialize the_fpm);
				896
				897	(* Run the main "interpreter loop" now. *)
				898	Toplevel.main_loop the_fpm the_execution_engine stream;
				899
				900	(* Print out all the generated code. *)
				901	dump_module Codegen.the_module
				902	;;
				903
				904	main ()
				905
				906	bindings.c
				907	.. code-block:: c
				908
				909	#include <stdio.h>
				910
				911	/* putchard - putchar that takes a double and returns 0. */
				912	extern double putchard(double X) {
				913	putchar((char)X);
				914	return 0;
				915	}
				916
				917	`Next: Extending the language: control flow <OCamlLangImpl5.html>`_
				918