=======================================================
Kaleidoscope: Extending the Language: Mutable Variables
=======================================================

.. contents::
   :local:

Chapter 7 Introduction
======================

Welcome to Chapter 7 of the "`Implementing a language with
LLVM <index.html>`_" tutorial. In chapters 1 through 6, we've built a
very respectable, albeit simple, `functional programming
language <http://en.wikipedia.org/wiki/Functional_programming>`_. In our
journey, we learned some parsing techniques, how to build and represent
an AST, how to build LLVM IR, and how to optimize the resultant code as
well as JIT compile it.

While Kaleidoscope is interesting as a functional language, the fact
that it is functional makes it "too easy" to generate LLVM IR for it. In
particular, a functional language makes it very easy to build LLVM IR
directly in `SSA
form <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_.
Since LLVM requires that the input code be in SSA form, this is a very
nice property; however, it is often unclear to newcomers how to generate
code for an imperative language with mutable variables.

The short (and happy) summary of this chapter is that there is no need
for your front-end to build SSA form: LLVM provides highly tuned and
well-tested support for this, though the way it works is a bit
unexpected for some.

Why is this a hard problem?
===========================

To understand why mutable variables cause complexities in SSA
construction, consider this extremely simple C example:

.. code-block:: c

    int G, H;
    int test(_Bool Condition) {
      int X;
      if (Condition)
        X = G;
      else
        X = H;
      return X;
    }

In this case, we have the variable "X", whose value depends on the path
executed in the program. Because there are two different possible values
for X before the return instruction, a PHI node is inserted to merge the
two values. The LLVM IR that we want for this example looks like this:

.. code-block:: llvm

    @G = weak global i32 0   ; type of @G is i32*
    @H = weak global i32 0   ; type of @H is i32*

    define i32 @test(i1 %Condition) {
    entry:
      br i1 %Condition, label %cond_true, label %cond_false

    cond_true:
      %X.0 = load i32, i32* @G
      br label %cond_next

    cond_false:
      %X.1 = load i32, i32* @H
      br label %cond_next

    cond_next:
      %X.2 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
      ret i32 %X.2
    }

In this example, the loads from the G and H global variables are
explicit in the LLVM IR, and they live in the then/else branches of the
if statement (cond\_true/cond\_false). In order to merge the incoming
values, the X.2 phi node in the cond\_next block selects the right value
to use based on where control flow is coming from: if control flow comes
from the cond\_false block, X.2 gets the value of X.1. Alternatively, if
control flow comes from cond\_true, it gets the value of X.0. The intent
of this chapter is not to explain the details of SSA form. For more
information, see one of the many `online
references <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_.

The question for this chapter is "who places the phi nodes when lowering
assignments to mutable variables?". The issue here is that LLVM
*requires* that its IR be in SSA form: there is no "non-SSA" mode for
it. However, SSA construction requires non-trivial algorithms and data
structures, so it is inconvenient and wasteful for every front-end to
have to reproduce this logic.
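
To get a feel for why the bookkeeping is non-trivial, here is a toy
single-basic-block renamer in C++ (all names here are made up for
illustration; this is not LLVM's implementation). Within one block,
renaming is easy: each assignment mints a fresh versioned name. The hard
part, which this sketch deliberately omits, is placing phi nodes where
control flow merges, as in the @test example above.

```cpp
#include <cassert>
#include <map>
#include <string>

// Toy single-basic-block SSA renamer. Every assignment to a variable gets
// a fresh versioned name, and every read refers to the newest version.
// Cross-block merges (phi placement) are intentionally not handled.
struct SSARenamer {
  std::map<std::string, int> Version; // next version number per variable

  // Called when a variable is assigned: returns the fresh SSA name.
  std::string def(const std::string &Var) {
    int V = Version[Var]++;
    return Var + "." + std::to_string(V);
  }

  // Called when a variable is read: returns the most recent SSA name.
  std::string use(const std::string &Var) {
    assert(Version.count(Var) && "use of variable before any assignment");
    return Var + "." + std::to_string(Version[Var] - 1);
  }
};
```

As soon as two blocks can reach the same read, "the most recent SSA
name" is no longer well defined, and you need dominance information to
decide where phi nodes go — which is exactly the machinery we would
rather not write ourselves.
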

Memory in LLVM
==============

The 'trick' here is that while LLVM does require all register values to
be in SSA form, it does not require (or permit) memory objects to be in
SSA form. In the example above, note that the loads from G and H are
direct accesses to G and H: they are not renamed or versioned. This
differs from some other compiler systems, which do try to version memory
objects. In LLVM, instead of encoding dataflow analysis of memory into
the LLVM IR, it is handled with `Analysis
Passes <../../WritingAnLLVMPass.html>`_ which are computed on demand.

With this in mind, the high-level idea is that we want to make a stack
variable (which lives in memory, because it is on the stack) for each
mutable object in a function. To take advantage of this trick, we need
to talk about how LLVM represents stack variables.

In LLVM, all memory accesses are explicit with load/store instructions,
and it is carefully designed not to have (or need) an "address-of"
operator. Notice how the type of the @G/@H global variables is actually
"i32\*" even though the variable is defined as "i32". What this means is
that @G defines *space* for an i32 in the global data area, but its
*name* actually refers to the address for that space. Stack variables
work the same way, except that instead of being declared with global
variable definitions, they are declared with the `LLVM alloca
instruction <../../LangRef.html#alloca-instruction>`_:

.. code-block:: llvm

    define i32 @example() {
    entry:
      %X = alloca i32           ; type of %X is i32*.
      ...
      %tmp = load i32, i32* %X  ; load the stack value %X from the stack.
      %tmp2 = add i32 %tmp, 1   ; increment it
      store i32 %tmp2, i32* %X  ; store it back
      ...

This code shows an example of how you can declare and manipulate a stack
variable in the LLVM IR. Stack memory allocated with the alloca
instruction is fully general: you can pass the address of the stack slot
to functions, you can store it in other variables, etc. In our example
above, we could rewrite the example to use the alloca technique to avoid
using a PHI node:

.. code-block:: llvm

    @G = weak global i32 0   ; type of @G is i32*
    @H = weak global i32 0   ; type of @H is i32*

    define i32 @test(i1 %Condition) {
    entry:
      %X = alloca i32           ; type of %X is i32*.
      br i1 %Condition, label %cond_true, label %cond_false

    cond_true:
      %X.0 = load i32, i32* @G
      store i32 %X.0, i32* %X   ; Update X
      br label %cond_next

    cond_false:
      %X.1 = load i32, i32* @H
      store i32 %X.1, i32* %X   ; Update X
      br label %cond_next

    cond_next:
      %X.2 = load i32, i32* %X  ; Read X
      ret i32 %X.2
    }

With this, we have discovered a way to handle arbitrary mutable
variables without the need to create Phi nodes at all:

#. Each mutable variable becomes a stack allocation.
#. Each read of the variable becomes a load from the stack.
#. Each update of the variable becomes a store to the stack.
#. Taking the address of a variable just uses the stack address
   directly.
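
The four rules above can be sketched as a tiny symbol table. This is a
toy illustration with made-up instruction strings, not the tutorial's
actual IRBuilder code:

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy sketch of the lowering rules: one stack slot per mutable variable,
// a load for every read, a store for every write. The emitted strings are
// stand-ins for real IRBuilder calls, purely for illustration.
struct FrontEnd {
  std::map<std::string, int> Slot;  // variable name -> stack slot id
  std::vector<std::string> Code;    // "emitted" instructions

  void declare(const std::string &Var) {       // rule 1: alloca per variable
    int Id = (int)Slot.size();
    Slot[Var] = Id;
    Code.push_back("%slot" + std::to_string(Id) + " = alloca double");
  }
  std::string read(const std::string &Var) {   // rule 2: read becomes a load
    std::string Tmp = "%t" + std::to_string(Code.size());
    Code.push_back(Tmp + " = load %slot" + std::to_string(Slot.at(Var)));
    return Tmp;
  }
  void write(const std::string &Var, const std::string &Val) { // rule 3: store
    Code.push_back("store " + Val + ", %slot" + std::to_string(Slot.at(Var)));
  }
};
```

Note that the front-end never has to reason about control flow at all:
every read and write is local, and the global analysis is deferred to
the optimizer.
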

While this solution has solved our immediate problem, it introduced
another one: we have now apparently introduced a lot of stack traffic
for very simple and common operations, a major performance problem.
Fortunately for us, the LLVM optimizer has a highly-tuned optimization
pass named "mem2reg" that handles this case, promoting allocas like this
into SSA registers, inserting Phi nodes as appropriate. If you run this
example through the pass, for example, you'll get:

.. code-block:: bash

    $ llvm-as < example.ll | opt -mem2reg | llvm-dis
    @G = weak global i32 0
    @H = weak global i32 0

    define i32 @test(i1 %Condition) {
    entry:
      br i1 %Condition, label %cond_true, label %cond_false

    cond_true:
      %X.0 = load i32, i32* @G
      br label %cond_next

    cond_false:
      %X.1 = load i32, i32* @H
      br label %cond_next

    cond_next:
      %X.01 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
      ret i32 %X.01
    }

The mem2reg pass implements the standard "iterated dominance frontier"
algorithm for constructing SSA form and has a number of optimizations
that speed up (very common) degenerate cases. The mem2reg optimization
pass is the answer to dealing with mutable variables, and we highly
recommend that you depend on it. Note that mem2reg only works on
variables in certain circumstances:

#. mem2reg is alloca-driven: it looks for allocas and if it can handle
   them, it promotes them. It does not apply to global variables or heap
   allocations.
#. mem2reg only looks for alloca instructions in the entry block of the
   function. Being in the entry block guarantees that the alloca is only
   executed once, which makes analysis simpler.
#. mem2reg only promotes allocas whose uses are direct loads and stores.
   If the address of the stack object is passed to a function, or if any
   funny pointer arithmetic is involved, the alloca will not be
   promoted.
#. mem2reg only works on allocas of `first
   class <../../LangRef.html#first-class-types>`_ values (such as pointers,
   scalars and vectors), and only if the array size of the allocation is
   1 (or missing in the .ll file). mem2reg is not capable of promoting
   structs or arrays to registers. Note that the "sroa" pass is
   more powerful and can promote structs, "unions", and arrays in many
   cases.

All of these properties are easy to satisfy for most imperative
languages, and we'll illustrate it below with Kaleidoscope. The final
question you may be asking is: should I bother with this nonsense for my
front-end? Wouldn't it be better if I just did SSA construction
directly, avoiding use of the mem2reg optimization pass? In short, we
strongly recommend that you use this technique for building SSA form,
unless there is an extremely good reason not to. Using this technique
is:

- Proven and well tested: clang uses this technique
  for local mutable variables. As such, the most common clients of LLVM
  are using this to handle a bulk of their variables. You can be sure
  that bugs are found fast and fixed early.
- Extremely Fast: mem2reg has a number of special cases that make it
  fast in common cases as well as fully general. For example, it has
  fast-paths for variables that are only used in a single block,
  variables that only have one assignment point, good heuristics to
  avoid insertion of unneeded phi nodes, etc.
- Needed for debug info generation: `Debug information in
  LLVM <../../SourceLevelDebugging.html>`_ relies on having the address of
  the variable exposed so that debug info can be attached to it. This
  technique dovetails very naturally with this style of debug info.

If nothing else, this makes it much easier to get your front-end up and
running, and is very simple to implement. Let's extend Kaleidoscope with
mutable variables now!

Mutable Variables in Kaleidoscope
=================================

Now that we know the sort of problem we want to tackle, let's see what
this looks like in the context of our little Kaleidoscope language.
We're going to add two features:

#. The ability to mutate variables with the '=' operator.
#. The ability to define new variables.

While the first item is really what this is about, we only have
variables for incoming arguments as well as for induction variables, and
redefining those only goes so far :). Also, the ability to define new
variables is a useful thing regardless of whether you will be mutating
them. Here's a motivating example that shows how we could use these:

::

    # Define ':' for sequencing: as a low-precedence operator that ignores operands
    # and just returns the RHS.
    def binary : 1 (x y) y;

    # Recursive fib, we could do this before.
    def fib(x)
      if (x < 3) then
        1
      else
        fib(x-1)+fib(x-2);

    # Iterative fib.
    def fibi(x)
      var a = 1, b = 1, c in
      (for i = 3, i < x in
         c = a + b :
         a = b :
         b = c) :
      b;

    # Call it.
    fibi(10);

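
For reference, here is the same pair of functions transcribed into plain
C++. This is a sketch for comparison only: it uses C++ loop bounds
(chosen so the two functions agree), which differ in detail from the
semantics of Kaleidoscope's 'for' expression.

```cpp
#include <cassert>

// Recursive fib, matching the Kaleidoscope definition above:
// fib(1) = fib(2) = 1, fib(n) = fib(n-1) + fib(n-2).
double fib(double x) {
  if (x < 3)
    return 1;
  return fib(x - 1) + fib(x - 2);
}

// Iterative version: the same three mutable variables a, b, c that fibi
// declares with var/in, updated in a loop.
double fibi(double x) {
  double a = 1, b = 1, c = 0;
  for (int i = 3; i <= x; ++i) { // C++ bounds, chosen to match fib()
    c = a + b;                   // c = a + b :
    a = b;                       // a = b :
    b = c;                       // b = c
  }
  return b;
}
```

The iterative version is exactly what the alloca trick buys us: three
mutable variables and an induction variable, with no SSA bookkeeping in
the front-end.
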
In order to mutate variables, we have to change our existing variables
to use the "alloca trick". Once we have that, we'll add our new
operator, then extend Kaleidoscope to support new variable definitions.

Adjusting Existing Variables for Mutation
=========================================

The symbol table in Kaleidoscope is managed at code generation time by
the '``NamedValues``' map. This map currently keeps track of the LLVM
"Value\*" that holds the double value for the named variable. In order
to support mutation, we need to change this slightly, so that
``NamedValues`` holds the *memory location* of the variable in question.
Note that this change is a refactoring: it changes the structure of the
code, but does not (by itself) change the behavior of the compiler. All
of these changes are isolated in the Kaleidoscope code generator.

At this point in Kaleidoscope's development, it only supports variables
for two things: incoming arguments to functions and the induction
variable of 'for' loops. For consistency, we'll allow mutation of these
variables in addition to other user-defined variables. This means that
these will both need memory locations.

To start our transformation of Kaleidoscope, we'll change the
NamedValues map so that it maps to AllocaInst\* instead of Value\*. Once
we do this, the C++ compiler will tell us what parts of the code we need
to update:

.. code-block:: c++

    static std::map<std::string, AllocaInst*> NamedValues;

Also, since we will need to create these allocas, we'll use a helper
function that ensures that the allocas are created in the entry block of
the function:

.. code-block:: c++

    /// CreateEntryBlockAlloca - Create an alloca instruction in the entry block of
    /// the function.  This is used for mutable variables etc.
    static AllocaInst *CreateEntryBlockAlloca(Function *TheFunction,
                                              const std::string &VarName) {
      IRBuilder<> TmpB(&TheFunction->getEntryBlock(),
                       TheFunction->getEntryBlock().begin());
      return TmpB.CreateAlloca(Type::getDoubleTy(TheContext), 0,
                               VarName.c_str());
    }

This funny looking code creates an IRBuilder object that is pointing at
the first instruction (.begin()) of the entry block. It then creates an
alloca with the expected name and returns it. Because all values in
Kaleidoscope are doubles, there is no need to pass in a type to use.

With this in place, the first functionality change we want to make belongs to
variable references. In our new scheme, variables live on the stack, so
code generating a reference to them actually needs to produce a load
from the stack slot:

.. code-block:: c++

    Value *VariableExprAST::codegen() {
      // Look this variable up in the function.
      Value *V = NamedValues[Name];
      if (!V)
        return LogErrorV("Unknown variable name");

      // Load the value.
      return Builder.CreateLoad(V, Name.c_str());
    }

As you can see, this is pretty straightforward. Now we need to update
the things that define the variables to set up the alloca. We'll start
with ``ForExprAST::codegen()`` (see the `full code listing <#id1>`_ for
the unabridged code):

.. code-block:: c++

      Function *TheFunction = Builder.GetInsertBlock()->getParent();

      // Create an alloca for the variable in the entry block.
      AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);

      // Emit the start code first, without 'variable' in scope.
      Value *StartVal = Start->codegen();
      if (!StartVal)
        return nullptr;

      // Store the value into the alloca.
      Builder.CreateStore(StartVal, Alloca);
      ...

      // Compute the end condition.
      Value *EndCond = End->codegen();
      if (!EndCond)
        return nullptr;

      // Reload, increment, and restore the alloca.  This handles the case where
      // the body of the loop mutates the variable.
      Value *CurVar = Builder.CreateLoad(Alloca);
      Value *NextVar = Builder.CreateFAdd(CurVar, StepVal, "nextvar");
      Builder.CreateStore(NextVar, Alloca);
      ...

This code is virtually identical to the code `before we allowed mutable
variables <LangImpl05.html#code-generation-for-the-for-loop>`_. The big
difference is that we no longer have to construct a PHI node, and we use
load/store to access the variable as needed.

To support mutable argument variables, we need to also make allocas for
them. The code for this is also pretty simple:

.. code-block:: c++

    Function *FunctionAST::codegen() {
      ...
      Builder.SetInsertPoint(BB);

      // Record the function arguments in the NamedValues map.
      NamedValues.clear();
      for (auto &Arg : TheFunction->args()) {
        // Create an alloca for this variable.
        AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, Arg.getName());

        // Store the initial value into the alloca.
        Builder.CreateStore(&Arg, Alloca);

        // Add arguments to variable symbol table.
        NamedValues[Arg.getName()] = Alloca;
      }

      if (Value *RetVal = Body->codegen()) {
        ...

For each argument, we make an alloca, store the input value to the
function into the alloca, and register the alloca as the memory location
for the argument. This happens in ``FunctionAST::codegen()`` right after
it sets up the entry block for the function.

The final missing piece is adding the mem2reg pass, which allows us to
get good codegen once again:

.. code-block:: c++

      // Promote allocas to registers.
      TheFPM->add(createPromoteMemoryToRegisterPass());
      // Do simple "peephole" optimizations and bit-twiddling optzns.
      TheFPM->add(createInstructionCombiningPass());
      // Reassociate expressions.
      TheFPM->add(createReassociatePass());
      ...

It is interesting to see what the code looks like before and after the
mem2reg optimization runs. For example, this is the before/after code
for our recursive fib function. Before the optimization:

.. code-block:: llvm

    define double @fib(double %x) {
    entry:
      %x1 = alloca double
      store double %x, double* %x1
      %x2 = load double, double* %x1
      %cmptmp = fcmp ult double %x2, 3.000000e+00
      %booltmp = uitofp i1 %cmptmp to double
      %ifcond = fcmp one double %booltmp, 0.000000e+00
      br i1 %ifcond, label %then, label %else

    then:                             ; preds = %entry
      br label %ifcont

    else:                             ; preds = %entry
      %x3 = load double, double* %x1
      %subtmp = fsub double %x3, 1.000000e+00
      %calltmp = call double @fib(double %subtmp)
      %x4 = load double, double* %x1
      %subtmp5 = fsub double %x4, 2.000000e+00
      %calltmp6 = call double @fib(double %subtmp5)
      %addtmp = fadd double %calltmp, %calltmp6
      br label %ifcont

    ifcont:                           ; preds = %else, %then
      %iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ]
      ret double %iftmp
    }

Here there is only one variable (x, the input argument) but you can
still see the extremely simple-minded code generation strategy we are
using. In the entry block, an alloca is created, and the initial input
value is stored into it. Each reference to the variable does a reload
from the stack. Also, note that we didn't modify the if/then/else
expression, so it still inserts a PHI node. While we could make an
alloca for it, it is actually easier to create a PHI node for it, so we
still just make the PHI.

Here is the code after the mem2reg pass runs:

.. code-block:: llvm

    define double @fib(double %x) {
    entry:
      %cmptmp = fcmp ult double %x, 3.000000e+00
      %booltmp = uitofp i1 %cmptmp to double
      %ifcond = fcmp one double %booltmp, 0.000000e+00
      br i1 %ifcond, label %then, label %else

    then:
      br label %ifcont

    else:
      %subtmp = fsub double %x, 1.000000e+00
      %calltmp = call double @fib(double %subtmp)
      %subtmp5 = fsub double %x, 2.000000e+00
      %calltmp6 = call double @fib(double %subtmp5)
      %addtmp = fadd double %calltmp, %calltmp6
      br label %ifcont

    ifcont:                           ; preds = %else, %then
      %iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ]
      ret double %iftmp
    }

This is a trivial case for mem2reg, since there are no redefinitions of
the variable. The point of showing this is to calm your tension about
inserting such blatant inefficiencies :).

After the rest of the optimizers run, we get:

.. code-block:: llvm

    define double @fib(double %x) {
    entry:
      %cmptmp = fcmp ult double %x, 3.000000e+00
      %booltmp = uitofp i1 %cmptmp to double
      %ifcond = fcmp ueq double %booltmp, 0.000000e+00
      br i1 %ifcond, label %else, label %ifcont

    else:
      %subtmp = fsub double %x, 1.000000e+00
      %calltmp = call double @fib(double %subtmp)
      %subtmp5 = fsub double %x, 2.000000e+00
      %calltmp6 = call double @fib(double %subtmp5)
      %addtmp = fadd double %calltmp, %calltmp6
      ret double %addtmp

    ifcont:
      ret double 1.000000e+00
    }

Here we see that the simplifycfg pass decided to clone the return
instruction into the end of the 'else' block. This allowed it to
eliminate some branches and the PHI node.

Now that all symbol table references are updated to use stack variables,
we'll add the assignment operator.

New Assignment Operator
=======================

With our current framework, adding a new assignment operator is really
simple. We will parse it just like any other binary operator, but handle
it internally (instead of allowing the user to define it). The first
step is to set a precedence:

.. code-block:: c++

    int main() {
      // Install standard binary operators.
      // 1 is lowest precedence.
      BinopPrecedence['='] = 2;
      BinopPrecedence['<'] = 10;
      BinopPrecedence['+'] = 20;
      BinopPrecedence['-'] = 20;

Now that the parser knows the precedence of the binary operator, it
takes care of all the parsing and AST generation. We just need to
implement codegen for the assignment operator. This looks like:

.. code-block:: c++

    Value *BinaryExprAST::codegen() {
      // Special case '=' because we don't want to emit the LHS as an expression.
      if (Op == '=') {
        // Assignment requires the LHS to be an identifier.
        VariableExprAST *LHSE = dynamic_cast<VariableExprAST*>(LHS.get());
        if (!LHSE)
          return LogErrorV("destination of '=' must be a variable");

Unlike the rest of the binary operators, our assignment operator doesn't
follow the "emit LHS, emit RHS, do computation" model. As such, it is
handled as a special case before the other binary operators are handled.
The other strange thing is that it requires the LHS to be a variable. It
is invalid to have "(x+1) = expr" - only things like "x = expr" are
allowed.


.. code-block:: c++

        // Codegen the RHS.
        Value *Val = RHS->codegen();
        if (!Val)
          return nullptr;

        // Look up the name.
        Value *Variable = NamedValues[LHSE->getName()];
        if (!Variable)
          return LogErrorV("Unknown variable name");

        Builder.CreateStore(Val, Variable);
        return Val;
      }
      ...

Once we have the variable, codegen'ing the assignment is
straightforward: we emit the RHS of the assignment, create a store, and
return the computed value. Returning a value allows for chained
assignments like "X = (Y = Z)".
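
This is the same convention C and C++ use for '='. A quick sanity check
of the behavior we are imitating (the function name here is made up for
illustration):

```cpp
#include <cassert>

// Mirrors what our codegen does for '=': store the RHS value, then yield
// that same value, so the result can feed an enclosing assignment.
double assignChain(double &X, double &Y, double Z) {
  return X = (Y = Z); // inner '=' yields Z; outer '=' stores and yields it too
}
```

Had we returned, say, a null value from the assignment's codegen
instead, "X = (Y = Z)" would have nothing to store into X.
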

Now that we have an assignment operator, we can mutate loop variables
and arguments. For example, we can now run code like this:

::

    # Function to print a double.
    extern printd(x);

    # Define ':' for sequencing: as a low-precedence operator that ignores operands
    # and just returns the RHS.
    def binary : 1 (x y) y;

    def test(x)
      printd(x) :
      x = 4 :
      printd(x);

    test(123);

When run, this example prints "123" and then "4", showing that we did
actually mutate the value! Okay, we have now officially implemented our
goal: getting this to work requires SSA construction in the general
case. However, to be really useful, we want the ability to define our
own local variables. Let's add this next!

User-defined Local Variables
============================

Adding var/in is just like any other extension we made to
Kaleidoscope: we extend the lexer, the parser, the AST and the code
generator. The first step for adding our new 'var/in' construct is to
extend the lexer. As before, this is pretty trivial; the code looks like
this:

.. code-block:: c++

    enum Token {
      ...
      // var definition
      tok_var = -13
      ...
    };
    ...
    static int gettok() {
        ...
        if (IdentifierStr == "in")
          return tok_in;
        if (IdentifierStr == "binary")
          return tok_binary;
        if (IdentifierStr == "unary")
          return tok_unary;
        if (IdentifierStr == "var")
          return tok_var;
        return tok_identifier;
        ...

The next step is to define the AST node that we will construct. For
var/in, it looks like this:

.. code-block:: c++

    /// VarExprAST - Expression class for var/in
    class VarExprAST : public ExprAST {
      std::vector<std::pair<std::string, std::unique_ptr<ExprAST>>> VarNames;
      std::unique_ptr<ExprAST> Body;

    public:
      VarExprAST(std::vector<std::pair<std::string, std::unique_ptr<ExprAST>>> VarNames,
                 std::unique_ptr<ExprAST> Body)
          : VarNames(std::move(VarNames)), Body(std::move(Body)) {}

      Value *codegen() override;
    };

var/in allows a list of names to be defined all at once, and each name
can optionally have an initializer value. As such, we capture this
information in the VarNames vector. Also, var/in has a body; this body
is allowed to access the variables defined by the var/in.

With this in place, we can define the parser pieces. The first thing we
do is add it as a primary expression:

.. code-block:: c++

    /// primary
    ///   ::= identifierexpr
    ///   ::= numberexpr
    ///   ::= parenexpr
    ///   ::= ifexpr
    ///   ::= forexpr
    ///   ::= varexpr
    static std::unique_ptr<ExprAST> ParsePrimary() {
      switch (CurTok) {
      default:
        return LogError("unknown token when expecting an expression");
      case tok_identifier:
        return ParseIdentifierExpr();
      case tok_number:
        return ParseNumberExpr();
      case '(':
        return ParseParenExpr();
      case tok_if:
        return ParseIfExpr();
      case tok_for:
        return ParseForExpr();
      case tok_var:
        return ParseVarExpr();
      }
    }

Next we define ParseVarExpr:

.. code-block:: c++

    /// varexpr ::= 'var' identifier ('=' expression)?
    //                    (',' identifier ('=' expression)?)* 'in' expression
    static std::unique_ptr<ExprAST> ParseVarExpr() {
      getNextToken(); // eat the var.

      std::vector<std::pair<std::string, std::unique_ptr<ExprAST>>> VarNames;

      // At least one variable name is required.
      if (CurTok != tok_identifier)
        return LogError("expected identifier after var");

The first part of this code parses the list of identifier/expr pairs
into the local ``VarNames`` vector.

.. code-block:: c++

      while (1) {
        std::string Name = IdentifierStr;
        getNextToken(); // eat identifier.

        // Read the optional initializer.
        std::unique_ptr<ExprAST> Init;
        if (CurTok == '=') {
          getNextToken(); // eat the '='.

          Init = ParseExpression();
          if (!Init) return nullptr;
        }

        VarNames.push_back(std::make_pair(Name, std::move(Init)));

        // End of var list, exit loop.
        if (CurTok != ',') break;
        getNextToken(); // eat the ','.

        if (CurTok != tok_identifier)
          return LogError("expected identifier list after var");
      }

Once all the variables are parsed, we then parse the body and create the
AST node:

.. code-block:: c++

      // At this point, we have to have 'in'.
      if (CurTok != tok_in)
        return LogError("expected 'in' keyword after 'var'");
      getNextToken(); // eat 'in'.

      auto Body = ParseExpression();
      if (!Body)
        return nullptr;

      return std::make_unique<VarExprAST>(std::move(VarNames),
                                          std::move(Body));
    }

Now that we can parse and represent the code, we need to support
emission of LLVM IR for it. This code starts out with:

.. code-block:: c++

    Value *VarExprAST::codegen() {
      std::vector<AllocaInst *> OldBindings;

      Function *TheFunction = Builder.GetInsertBlock()->getParent();

      // Register all variables and emit their initializer.
      for (unsigned i = 0, e = VarNames.size(); i != e; ++i) {
        const std::string &VarName = VarNames[i].first;
        ExprAST *Init = VarNames[i].second.get();

Basically it loops over all the variables, installing them one at a
time. For each variable we put into the symbol table, we remember the
previous value that we replace in OldBindings.

.. code-block:: c++

        // Emit the initializer before adding the variable to scope, this prevents
        // the initializer from referencing the variable itself, and permits stuff
        // like this:
        //  var a = 1 in
        //    var a = a in ...   # refers to outer 'a'.
        Value *InitVal;
        if (Init) {
          InitVal = Init->codegen();
          if (!InitVal)
            return nullptr;
        } else { // If not specified, use 0.0.
          InitVal = ConstantFP::get(TheContext, APFloat(0.0));
        }

        AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);
        Builder.CreateStore(InitVal, Alloca);

        // Remember the old variable binding so that we can restore the binding when
        // we unrecurse.
        OldBindings.push_back(NamedValues[VarName]);

        // Remember this binding.
        NamedValues[VarName] = Alloca;
      }

There are more comments here than code. The basic idea is that we emit
the initializer, create the alloca, then update the symbol table to
point to it. Once all the variables are installed in the symbol table,
we evaluate the body of the var/in expression:

.. code-block:: c++

      // Codegen the body, now that all vars are in scope.
      Value *BodyVal = Body->codegen();
      if (!BodyVal)
        return nullptr;

Finally, before returning, we restore the previous variable bindings:

.. code-block:: c++

      // Pop all our variables from scope.
      for (unsigned i = 0, e = VarNames.size(); i != e; ++i)
        NamedValues[VarNames[i].first] = OldBindings[i];

      // Return the body computation.
      return BodyVal;
    }

The end result of all of this is that we get properly scoped variable
definitions, and we even (trivially) allow mutation of them :).
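
The save/restore discipline can be demonstrated standalone. In this
sketch (hypothetical names, with plain doubles standing in for
AllocaInst pointers), shadowed bindings are recorded on entry and
restored on exit, mirroring what ``VarExprAST::codegen()`` does above.
Like the tutorial code, a name that had no previous binding is restored
to a default value rather than erased:

```cpp
#include <map>
#include <string>
#include <vector>

// Symbol table: variable name -> current value (stands in for AllocaInst*).
static std::map<std::string, double> NamedVals;

// Evaluate Body with Vars temporarily bound, then restore outer bindings.
double withVars(const std::vector<std::pair<std::string, double>> &Vars,
                double (*Body)()) {
  std::vector<double> OldBindings;
  for (const auto &V : Vars) {
    OldBindings.push_back(NamedVals[V.first]); // remember shadowed value
    NamedVals[V.first] = V.second;             // install inner binding
  }
  double Result = Body(); // body sees the inner bindings
  for (unsigned i = 0, e = Vars.size(); i != e; ++i)
    NamedVals[Vars[i].first] = OldBindings[i]; // restore outer bindings
  return Result;
}
```

Because restoration happens unconditionally on the way out, nesting
works for free: each var/in level pops exactly the bindings it pushed.
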

With this, we completed what we set out to do. Our nice iterative fib
example from the intro compiles and runs just fine. The mem2reg pass
optimizes all of our stack variables into SSA registers, inserting PHI
nodes where needed, and our front-end remains simple: no "iterated
dominance frontier" computation anywhere in sight.

Full Code Listing
=================

Here is the complete code listing for our running example, enhanced with
mutable variables and var/in support. To build this example, use:

.. code-block:: bash

    # Compile
    clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core mcjit native` -O3 -o toy
    # Run
    ./toy

Here is the code:

.. literalinclude:: ../../../examples/Kaleidoscope/Chapter7/toy.cpp
   :language: c++

`Next: Compiling to Object Code <LangImpl08.html>`_
