:orphan:

=======================================================
Kaleidoscope: Extending the Language: Mutable Variables
=======================================================

.. contents::
   :local:
Chapter 7 Introduction
======================

Welcome to Chapter 7 of the "`Implementing a language with
LLVM <index.html>`_" tutorial. In chapters 1 through 6, we've built a
very respectable, albeit simple, `functional programming
language <http://en.wikipedia.org/wiki/Functional_programming>`_. In our
journey, we learned some parsing techniques, how to build and represent
an AST, how to build LLVM IR, and how to optimize the resultant code as
well as JIT compile it.

While Kaleidoscope is interesting as a functional language, the fact
that it is functional makes it "too easy" to generate LLVM IR for it. In
particular, a functional language makes it very easy to build LLVM IR
directly in `SSA
form <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_.
Since LLVM requires that the input code be in SSA form, this is a very
nice property. Newcomers, however, are often unsure how to generate code
for an imperative language with mutable variables.

The short (and happy) summary of this chapter is that there is no need
for your front-end to build SSA form: LLVM provides highly tuned and
well-tested support for this, though the way it works is a bit
unexpected for some.
Why is this a hard problem?
===========================

To understand why mutable variables cause complexities in SSA
construction, consider this extremely simple C example:

.. code-block:: c

    int G, H;
    int test(_Bool Condition) {
      int X;
      if (Condition)
        X = G;
      else
        X = H;
      return X;
    }

In this case, we have the variable "X", whose value depends on the path
executed in the program. Because there are two different possible values
for X before the return instruction, a PHI node is inserted to merge the
two values. The LLVM IR that we want for this example looks like this:
.. code-block:: llvm

    @G = weak global i32 0   ; type of @G is i32*
    @H = weak global i32 0   ; type of @H is i32*

    define i32 @test(i1 %Condition) {
    entry:
      br i1 %Condition, label %cond_true, label %cond_false

    cond_true:
      %X.0 = load i32, i32* @G
      br label %cond_next

    cond_false:
      %X.1 = load i32, i32* @H
      br label %cond_next

    cond_next:
      %X.2 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
      ret i32 %X.2
    }

In this example, the loads from the G and H global variables are
explicit in the LLVM IR, and they live in the then/else branches of the
if statement (cond_true/cond_false). In order to merge the incoming
values, the X.2 phi node in the cond_next block selects the right value
to use based on where control flow is coming from: if control flow comes
from the cond_false block, X.2 gets the value of X.1. Alternatively, if
control flow comes from cond_true, it gets the value of X.0. The intent
of this chapter is not to explain the details of SSA form. For more
information, see one of the many `online
references <http://en.wikipedia.org/wiki/Static_single_assignment_form>`_.

The question for this article is "who places the phi nodes when lowering
assignments to mutable variables?". The issue here is that LLVM
*requires* that its IR be in SSA form: there is no "non-SSA" mode for
it. However, SSA construction requires non-trivial algorithms and data
structures, so it is inconvenient and wasteful for every front-end to
have to reproduce this logic.
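
To give a sense of the machinery a front-end would otherwise need, here is a
small, self-contained C++ sketch (invented for illustration; none of these
names come from LLVM's API) of the classic iterative dominator computation on
the diamond CFG above. Real SSA construction builds dominance frontiers on top
of exactly this kind of analysis to decide where phi nodes belong:

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Compute dominator sets for a tiny CFG by iterating
// Dom(n) = {n} union intersection-of Dom(p) over predecessors p
// until a fixpoint is reached.
std::map<std::string, std::set<std::string>>
computeDominators(const std::map<std::string, std::vector<std::string>> &Preds,
                  const std::string &Entry) {
  std::set<std::string> All;
  for (auto &KV : Preds)
    All.insert(KV.first);

  // The entry dominates only itself initially; everything else starts at All.
  std::map<std::string, std::set<std::string>> Dom;
  for (auto &N : All)
    Dom[N] = (N == Entry) ? std::set<std::string>{Entry} : All;

  bool Changed = true;
  while (Changed) {
    Changed = false;
    for (auto &N : All) {
      if (N == Entry)
        continue;
      // Intersect the dominator sets of all predecessors.
      std::set<std::string> NewDom = All;
      for (auto &P : Preds.at(N)) {
        std::set<std::string> Tmp;
        for (auto &D : NewDom)
          if (Dom[P].count(D))
            Tmp.insert(D);
        NewDom = Tmp;
      }
      NewDom.insert(N); // every block dominates itself
      if (NewDom != Dom[N]) {
        Dom[N] = NewDom;
        Changed = true;
      }
    }
  }
  return Dom;
}
```

For the diamond above, the merge block cond_next is dominated only by the
entry block and itself; neither branch dominates it, which is precisely why X
needs a phi node there.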
Memory in LLVM
==============

The 'trick' here is that while LLVM does require all register values to
be in SSA form, it does not require (or permit) memory objects to be in
SSA form. In the example above, note that the loads from G and H are
direct accesses to G and H: they are not renamed or versioned. This
differs from some other compiler systems, which do try to version memory
objects. In LLVM, instead of encoding dataflow analysis of memory into
the LLVM IR, it is handled with `Analysis
Passes <../WritingAnLLVMPass.html>`_ which are computed on demand.

With this in mind, the high-level idea is that we want to make a stack
variable (which lives in memory, because it is on the stack) for each
mutable object in a function. To take advantage of this trick, we need
to talk about how LLVM represents stack variables.

In LLVM, all memory accesses are explicit with load/store instructions,
and it is carefully designed not to have (or need) an "address-of"
operator. Notice how the type of the @G/@H global variables is actually
"i32\*" even though the variable is defined as "i32". What this means is
that @G defines *space* for an i32 in the global data area, but its
*name* actually refers to the address for that space. Stack variables
work the same way, except that instead of being declared with global
variable definitions, they are declared with the `LLVM alloca
instruction <../LangRef.html#alloca-instruction>`_:

.. code-block:: llvm

    define i32 @example() {
    entry:
      %X = alloca i32           ; type of %X is i32*.
      ...
      %tmp = load i32, i32* %X  ; load the stack value %X from the stack.
      %tmp2 = add i32 %tmp, 1   ; increment it
      store i32 %tmp2, i32* %X  ; store it back
      ...

This code shows an example of how you can declare and manipulate a stack
variable in the LLVM IR. Stack memory allocated with the alloca
instruction is fully general: you can pass the address of the stack slot
to functions, you can store it in other variables, etc. In our example
above, we could rewrite the example to use the alloca technique to avoid
using a PHI node:
.. code-block:: llvm

    @G = weak global i32 0   ; type of @G is i32*
    @H = weak global i32 0   ; type of @H is i32*

    define i32 @test(i1 %Condition) {
    entry:
      %X = alloca i32           ; type of %X is i32*.
      br i1 %Condition, label %cond_true, label %cond_false

    cond_true:
      %X.0 = load i32, i32* @G
      store i32 %X.0, i32* %X   ; Update X
      br label %cond_next

    cond_false:
      %X.1 = load i32, i32* @H
      store i32 %X.1, i32* %X   ; Update X
      br label %cond_next

    cond_next:
      %X.2 = load i32, i32* %X  ; Read X
      ret i32 %X.2
    }

With this, we have discovered a way to handle arbitrary mutable
variables without the need to create Phi nodes at all:

#. Each mutable variable becomes a stack allocation.
#. Each read of the variable becomes a load from the stack.
#. Each update of the variable becomes a store to the stack.
#. Taking the address of a variable just uses the stack address
   directly.

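The four rules above can be sketched as a deliberately tiny C++ model
(the ``FrontEnd``/``Slots`` names are invented for illustration, and a plain
owned double stands in for a real stack slot):

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>

// Toy model of the lowering recipe: every mutable variable gets a
// "stack slot" (here just an owned double), and all reads and writes
// go through that slot, exactly as loads/stores would in LLVM IR.
struct FrontEnd {
  std::map<std::string, std::unique_ptr<double>> Slots;

  double *allocaSlot(const std::string &Name) {    // rule 1: stack allocation
    Slots[Name] = std::make_unique<double>(0.0);
    return Slots[Name].get();
  }
  double load(const std::string &Name) {           // rule 2: read = load
    return *Slots.at(Name);
  }
  void store(const std::string &Name, double V) {  // rule 3: update = store
    *Slots.at(Name) = V;
  }
  double *addressOf(const std::string &Name) {     // rule 4: address-of
    return Slots.at(Name).get();
  }
};
```

Nothing here is clever, and that is the point: the front-end never has to
reason about control flow at all.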
While this solution has solved our immediate problem, it introduced
another one: we have now apparently introduced a lot of stack traffic
for very simple and common operations, a major performance problem.
Fortunately for us, the LLVM optimizer has a highly-tuned optimization
pass named "mem2reg" that handles this case, promoting allocas like this
into SSA registers, inserting Phi nodes as appropriate. If you run this
example through the pass, for example, you'll get:

.. code-block:: bash

    $ llvm-as < example.ll | opt -mem2reg | llvm-dis
    @G = weak global i32 0
    @H = weak global i32 0

    define i32 @test(i1 %Condition) {
    entry:
      br i1 %Condition, label %cond_true, label %cond_false

    cond_true:
      %X.0 = load i32, i32* @G
      br label %cond_next

    cond_false:
      %X.1 = load i32, i32* @H
      br label %cond_next

    cond_next:
      %X.01 = phi i32 [ %X.1, %cond_false ], [ %X.0, %cond_true ]
      ret i32 %X.01
    }

The mem2reg pass implements the standard "iterated dominance frontier"
algorithm for constructing SSA form and has a number of optimizations
that speed up (very common) degenerate cases. The mem2reg optimization
pass is the answer to dealing with mutable variables, and we highly
recommend that you depend on it. Note that mem2reg only works on
variables in certain circumstances:

#. mem2reg is alloca-driven: it looks for allocas and if it can handle
   them, it promotes them. It does not apply to global variables or heap
   allocations.
#. mem2reg only looks for alloca instructions in the entry block of the
   function. Being in the entry block guarantees that the alloca is only
   executed once, which makes analysis simpler.
#. mem2reg only promotes allocas whose uses are direct loads and stores.
   If the address of the stack object is passed to a function, or if any
   funny pointer arithmetic is involved, the alloca will not be
   promoted.
#. mem2reg only works on allocas of `first
   class <../LangRef.html#first-class-types>`_ values (such as pointers,
   scalars and vectors), and only if the array size of the allocation is
   1 (or missing in the .ll file). mem2reg is not capable of promoting
   structs or arrays to registers. Note that the "sroa" pass is
   more powerful and can promote structs, "unions", and arrays in many
   cases.

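As a rough mental model only (the field names below are invented for
illustration and do not correspond to LLVM's real API), the conditions above
amount to a simple predicate over each alloca:

```cpp
#include <cassert>

// Hypothetical per-alloca summary mirroring the four conditions listed
// above; this is a teaching aid, not LLVM's actual data structure.
struct AllocaInfo {
  bool InEntryBlock;        // condition 2: alloca appears in the entry block
  bool OnlyDirectLoadStore; // condition 3: address never escapes or is offset
  bool FirstClassType;      // condition 4: scalar/pointer/vector element type
  unsigned ArraySize;       // condition 4: array size of the allocation
};

bool isPromotable(const AllocaInfo &AI) {
  return AI.InEntryBlock && AI.OnlyDirectLoadStore &&
         AI.FirstClassType && AI.ArraySize == 1;
}
```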
All of these properties are easy to satisfy for most imperative
languages, and we'll illustrate this below with Kaleidoscope. The final
question you may be asking is: should I bother with this nonsense for my
front-end? Wouldn't it be better if I just did SSA construction
directly, avoiding use of the mem2reg optimization pass? In short, we
strongly recommend that you use this technique for building SSA form,
unless there is an extremely good reason not to. Using this technique
is:

- Proven and well tested: clang uses this technique
  for local mutable variables. As such, the most common clients of LLVM
  are using this to handle the bulk of their variables. You can be sure
  that bugs are found fast and fixed early.
- Extremely fast: mem2reg has a number of special cases that make it
  fast in common cases as well as fully general. For example, it has
  fast-paths for variables that are only used in a single block,
  variables that only have one assignment point, good heuristics to
  avoid insertion of unneeded phi nodes, etc.
- Needed for debug info generation: `Debug information in
  LLVM <../SourceLevelDebugging.html>`_ relies on having the address of
  the variable exposed so that debug info can be attached to it. This
  technique dovetails very naturally with this style of debug info.

If nothing else, this makes it much easier to get your front-end up and
running, and is very simple to implement. Let's extend Kaleidoscope with
mutable variables now!
Mutable Variables in Kaleidoscope
=================================

Now that we know the sort of problem we want to tackle, let's see what
this looks like in the context of our little Kaleidoscope language.
We're going to add two features:

#. The ability to mutate variables with the '=' operator.
#. The ability to define new variables.

While the first item is really what this is about, we only have
variables for incoming arguments as well as for induction variables, and
redefining those only goes so far :). Also, the ability to define new
variables is a useful thing regardless of whether you will be mutating
them. Here's a motivating example that shows how we could use these:

::

    # Define ':' for sequencing: as a low-precedence operator that ignores operands
    # and just returns the RHS.
    def binary : 1 (x y) y;

    # Recursive fib, we could do this before.
    def fib(x)
      if (x < 3) then
        1
      else
        fib(x-1)+fib(x-2);

    # Iterative fib.
    def fibi(x)
      var a = 1, b = 1, c in
      (for i = 3, i < x in
         c = a + b :
         a = b :
         b = c) :
      b;

    # Call it.
    fibi(10);

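For comparison, the iterative ``fibi`` above corresponds to this ordinary C++
loop, under the usual convention fib(1) = fib(2) = 1 (Kaleidoscope's 'for'
tests its end condition after running the body, so the body also runs for the
final index, which is why the C++ loop uses ``i <= x``):

```cpp
#include <cassert>

// C++ equivalent of the Kaleidoscope 'fibi' above: a and b walk the
// Fibonacci sequence, c holds the next value.
double fibi(int x) {
  double a = 1, b = 1, c = 0;
  for (int i = 3; i <= x; ++i) {
    c = a + b;
    a = b;
    b = c;
  }
  return b;
}
```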
In order to mutate variables, we have to change our existing variables
to use the "alloca trick". Once we have that, we'll add our new
operator, then extend Kaleidoscope to support new variable definitions.

Adjusting Existing Variables for Mutation
=========================================

The symbol table in Kaleidoscope is managed at code generation time by
the '``NamedValues``' map. This map currently keeps track of the LLVM
"Value\*" that holds the double value for the named variable. In order
to support mutation, we need to change this slightly, so that
``NamedValues`` holds the *memory location* of the variable in question.
Note that this change is a refactoring: it changes the structure of the
code, but does not (by itself) change the behavior of the compiler. All
of these changes are isolated in the Kaleidoscope code generator.

At this point in Kaleidoscope's development, it only supports variables
for two things: incoming arguments to functions and the induction
variable of 'for' loops. For consistency, we'll allow mutation of these
variables in addition to other user-defined variables. This means that
these will both need memory locations.

To start our transformation of Kaleidoscope, we'll change the
NamedValues map so that it maps to AllocaInst\* instead of Value\*. Once
we do this, the C++ compiler will tell us what parts of the code we need
to update:

.. code-block:: c++

    static std::map<std::string, AllocaInst*> NamedValues;

Also, since we will need to create these allocas, we'll use a helper
function that ensures that the allocas are created in the entry block of
the function:

.. code-block:: c++

    /// CreateEntryBlockAlloca - Create an alloca instruction in the entry block of
    /// the function.  This is used for mutable variables etc.
    static AllocaInst *CreateEntryBlockAlloca(Function *TheFunction,
                                              const std::string &VarName) {
      IRBuilder<> TmpB(&TheFunction->getEntryBlock(),
                       TheFunction->getEntryBlock().begin());
      return TmpB.CreateAlloca(Type::getDoubleTy(TheContext), 0,
                               VarName.c_str());
    }

This funny looking code creates an IRBuilder object that is pointing at
the first instruction (.begin()) of the entry block. It then creates an
alloca with the expected name and returns it. Because all values in
Kaleidoscope are doubles, there is no need to pass in a type to use.
With this in place, the first functionality change we want to make is to
variable references. In our new scheme, variables live on the stack, so
code generating a reference to them actually needs to produce a load
from the stack slot:

.. code-block:: c++

    Value *VariableExprAST::codegen() {
      // Look this variable up in the function.
      Value *V = NamedValues[Name];
      if (!V)
        return LogErrorV("Unknown variable name");

      // Load the value.
      return Builder.CreateLoad(V, Name.c_str());
    }

As you can see, this is pretty straightforward. Now we need to update
the things that define the variables to set up the alloca. We'll start
with ``ForExprAST::codegen()`` (see the `full code listing <#id1>`_ for
the unabridged code):

.. code-block:: c++

    Function *TheFunction = Builder.GetInsertBlock()->getParent();

    // Create an alloca for the variable in the entry block.
    AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);

    // Emit the start code first, without 'variable' in scope.
    Value *StartVal = Start->codegen();
    if (!StartVal)
      return nullptr;

    // Store the value into the alloca.
    Builder.CreateStore(StartVal, Alloca);
    ...

    // Compute the end condition.
    Value *EndCond = End->codegen();
    if (!EndCond)
      return nullptr;

    // Reload, increment, and restore the alloca.  This handles the case where
    // the body of the loop mutates the variable.
    Value *CurVar = Builder.CreateLoad(Alloca);
    Value *NextVar = Builder.CreateFAdd(CurVar, StepVal, "nextvar");
    Builder.CreateStore(NextVar, Alloca);
    ...

This code is virtually identical to the code `before we allowed mutable
variables <LangImpl5.html#code-generation-for-the-for-loop>`_. The big difference is that we
no longer have to construct a PHI node, and we use load/store to access
the variable as needed.
To support mutable argument variables, we need to also make allocas for
them. The code for this is also pretty simple:

.. code-block:: c++

    Function *FunctionAST::codegen() {
      ...
      Builder.SetInsertPoint(BB);

      // Record the function arguments in the NamedValues map.
      NamedValues.clear();
      for (auto &Arg : TheFunction->args()) {
        // Create an alloca for this variable.
        AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, Arg.getName());

        // Store the initial value into the alloca.
        Builder.CreateStore(&Arg, Alloca);

        // Add arguments to variable symbol table.
        NamedValues[Arg.getName()] = Alloca;
      }

      if (Value *RetVal = Body->codegen()) {
        ...

For each argument, we make an alloca, store the input value to the
function into the alloca, and register the alloca as the memory location
for the argument. This happens in ``FunctionAST::codegen()`` right after
it sets up the entry block for the function.

The final missing piece is adding the mem2reg pass, which allows us to
get good codegen once again:

.. code-block:: c++

    // Promote allocas to registers.
    TheFPM->add(createPromoteMemoryToRegisterPass());
    // Do simple "peephole" optimizations and bit-twiddling optzns.
    TheFPM->add(createInstructionCombiningPass());
    // Reassociate expressions.
    TheFPM->add(createReassociatePass());
    ...

It is interesting to see what the code looks like before and after the
mem2reg optimization runs. For example, this is the before/after code
for our recursive fib function. Before the optimization:
.. code-block:: llvm

    define double @fib(double %x) {
    entry:
      %x1 = alloca double
      store double %x, double* %x1
      %x2 = load double, double* %x1
      %cmptmp = fcmp ult double %x2, 3.000000e+00
      %booltmp = uitofp i1 %cmptmp to double
      %ifcond = fcmp one double %booltmp, 0.000000e+00
      br i1 %ifcond, label %then, label %else

    then:       ; preds = %entry
      br label %ifcont

    else:       ; preds = %entry
      %x3 = load double, double* %x1
      %subtmp = fsub double %x3, 1.000000e+00
      %calltmp = call double @fib(double %subtmp)
      %x4 = load double, double* %x1
      %subtmp5 = fsub double %x4, 2.000000e+00
      %calltmp6 = call double @fib(double %subtmp5)
      %addtmp = fadd double %calltmp, %calltmp6
      br label %ifcont

    ifcont:     ; preds = %else, %then
      %iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ]
      ret double %iftmp
    }

Here there is only one variable (x, the input argument) but you can
still see the extremely simple-minded code generation strategy we are
using. In the entry block, an alloca is created, and the initial input
value is stored into it. Each reference to the variable does a reload
from the stack. Also, note that we didn't modify the if/then/else
expression, so it still inserts a PHI node. While we could make an
alloca for it, it is actually easier to create a PHI node for it, so we
still just make the PHI.

Here is the code after the mem2reg pass runs:
.. code-block:: llvm

    define double @fib(double %x) {
    entry:
      %cmptmp = fcmp ult double %x, 3.000000e+00
      %booltmp = uitofp i1 %cmptmp to double
      %ifcond = fcmp one double %booltmp, 0.000000e+00
      br i1 %ifcond, label %then, label %else

    then:
      br label %ifcont

    else:
      %subtmp = fsub double %x, 1.000000e+00
      %calltmp = call double @fib(double %subtmp)
      %subtmp5 = fsub double %x, 2.000000e+00
      %calltmp6 = call double @fib(double %subtmp5)
      %addtmp = fadd double %calltmp, %calltmp6
      br label %ifcont

    ifcont:     ; preds = %else, %then
      %iftmp = phi double [ 1.000000e+00, %then ], [ %addtmp, %else ]
      ret double %iftmp
    }

This is a trivial case for mem2reg, since there are no redefinitions of
the variable. The point of showing this is to calm your tension about
inserting such blatant inefficiencies :).

After the rest of the optimizers run, we get:

.. code-block:: llvm
    define double @fib(double %x) {
    entry:
      %cmptmp = fcmp ult double %x, 3.000000e+00
      %booltmp = uitofp i1 %cmptmp to double
      %ifcond = fcmp ueq double %booltmp, 0.000000e+00
      br i1 %ifcond, label %else, label %ifcont

    else:
      %subtmp = fsub double %x, 1.000000e+00
      %calltmp = call double @fib(double %subtmp)
      %subtmp5 = fsub double %x, 2.000000e+00
      %calltmp6 = call double @fib(double %subtmp5)
      %addtmp = fadd double %calltmp, %calltmp6
      ret double %addtmp

    ifcont:
      ret double 1.000000e+00
    }

Here we see that the simplifycfg pass decided to clone the return
instruction into the end of the 'else' block. This allowed it to
eliminate some branches and the PHI node.

Now that all symbol table references are updated to use stack variables,
we'll add the assignment operator.

New Assignment Operator
=======================

With our current framework, adding a new assignment operator is really
simple. We will parse it just like any other binary operator, but handle
it internally (instead of allowing the user to define it). The first
step is to set a precedence:

.. code-block:: c++

    int main() {
      // Install standard binary operators.
      // 1 is lowest precedence.
      BinopPrecedence['='] = 2;
      BinopPrecedence['<'] = 10;
      BinopPrecedence['+'] = 20;
      BinopPrecedence['-'] = 20;

Now that the parser knows the precedence of the binary operator, it
takes care of all the parsing and AST generation. We just need to
implement codegen for the assignment operator. This looks like:

.. code-block:: c++

    Value *BinaryExprAST::codegen() {
      // Special case '=' because we don't want to emit the LHS as an expression.
      if (Op == '=') {
        // Assignment requires the LHS to be an identifier.
        VariableExprAST *LHSE = dynamic_cast<VariableExprAST*>(LHS.get());
        if (!LHSE)
          return LogErrorV("destination of '=' must be a variable");

Unlike the rest of the binary operators, our assignment operator doesn't
follow the "emit LHS, emit RHS, do computation" model. As such, it is
handled as a special case before the other binary operators are handled.
The other strange thing is that it requires the LHS to be a variable. It
is invalid to have "(x+1) = expr" - only things like "x = expr" are
allowed.

.. code-block:: c++

        // Codegen the RHS.
        Value *Val = RHS->codegen();
        if (!Val)
          return nullptr;

        // Look up the name.
        Value *Variable = NamedValues[LHSE->getName()];
        if (!Variable)
          return LogErrorV("Unknown variable name");

        Builder.CreateStore(Val, Variable);
        return Val;
      }
      ...

Once we have the variable, codegen'ing the assignment is
straightforward: we emit the RHS of the assignment, create a store, and
return the computed value. Returning a value allows for chained
assignments like "X = (Y = Z)".

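The "return the stored value" convention is easy to see in a tiny C++ model of
assignment (the names here are invented for illustration, with a plain map
standing in for the allocas):

```cpp
#include <cassert>
#include <map>
#include <string>

// Each variable's "slot"; a stand-in for the real alloca-backed storage.
std::map<std::string, double> Slots;

// Store into the variable's slot and return the stored value; returning
// the value is exactly what makes "X = (Y = Z)" work.
double assign(const std::string &Name, double V) {
  Slots[Name] = V;
  return V;
}
```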
Now that we have an assignment operator, we can mutate loop variables
and arguments. For example, we can now run code like this:

::

    # Function to print a double.
    extern printd(x);

    # Define ':' for sequencing: as a low-precedence operator that ignores operands
    # and just returns the RHS.
    def binary : 1 (x y) y;

    def test(x)
      printd(x) :
      x = 4 :
      printd(x);

    test(123);

When run, this example prints "123" and then "4", showing that we did
actually mutate the value! Okay, we have now officially implemented our
goal: getting this to work requires SSA construction in the general
case. However, to be really useful, we want the ability to define our
own local variables. Let's add this next!
User-defined Local Variables
============================

Adding var/in is just like any other extension we made to
Kaleidoscope: we extend the lexer, the parser, the AST and the code
generator. The first step for adding our new 'var/in' construct is to
extend the lexer. As before, this is pretty trivial; the code looks like
this:

.. code-block:: c++

    enum Token {
      ...
      // var definition
      tok_var = -13
    ...
    }
    ...
    static int gettok() {
    ...
        if (IdentifierStr == "in")
          return tok_in;
        if (IdentifierStr == "binary")
          return tok_binary;
        if (IdentifierStr == "unary")
          return tok_unary;
        if (IdentifierStr == "var")
          return tok_var;
        return tok_identifier;
    ...

The next step is to define the AST node that we will construct. For
var/in, it looks like this:

.. code-block:: c++

    /// VarExprAST - Expression class for var/in
    class VarExprAST : public ExprAST {
      std::vector<std::pair<std::string, std::unique_ptr<ExprAST>>> VarNames;
      std::unique_ptr<ExprAST> Body;

    public:
      VarExprAST(std::vector<std::pair<std::string, std::unique_ptr<ExprAST>>> VarNames,
                 std::unique_ptr<ExprAST> Body)
          : VarNames(std::move(VarNames)), Body(std::move(Body)) {}

      Value *codegen() override;
    };

var/in allows a list of names to be defined all at once, and each name
can optionally have an initializer value. As such, we capture this
information in the VarNames vector. Also, var/in has a body; this body
is allowed to access the variables defined by the var/in.

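Before diving into the parser and codegen, it is worth seeing the scoping
discipline var/in needs in isolation: install the new bindings, run the body,
then restore whatever was shadowed. Here is a small self-contained C++ sketch
of that idea (the names are invented for illustration; the real codegen below
operates on ``NamedValues``' AllocaInst\* entries, not plain doubles):

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Stand-in for the symbol table; real code maps names to AllocaInst*.
std::map<std::string, double> NamedVals;

// Install Vars, evaluate Body with them visible, then undo: erase the
// brand-new names and put back the values they shadowed.
double withBindings(const std::vector<std::pair<std::string, double>> &Vars,
                    double (*Body)()) {
  std::vector<std::pair<std::string, double>> Old;
  std::vector<std::string> Fresh;
  for (auto &V : Vars) {
    auto It = NamedVals.find(V.first);
    if (It != NamedVals.end())
      Old.push_back(*It);       // shadowed: remember the previous value
    else
      Fresh.push_back(V.first); // brand-new name
    NamedVals[V.first] = V.second;
  }

  double Result = Body();       // the body sees the new bindings

  for (auto &N : Fresh)
    NamedVals.erase(N);
  for (auto &O : Old)
    NamedVals[O.first] = O.second; // unshadow
  return Result;
}
```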
| 695 | With this in place, we can define the parser pieces. The first thing we |
| 696 | do is add it as a primary expression: |
| 697 | |
| 698 | .. code-block:: c++ |
| 699 | |
| 700 | /// primary |
| 701 | /// ::= identifierexpr |
| 702 | /// ::= numberexpr |
| 703 | /// ::= parenexpr |
| 704 | /// ::= ifexpr |
| 705 | /// ::= forexpr |
| 706 | /// ::= varexpr |
| 707 | static std::unique_ptr<ExprAST> ParsePrimary() { |
| 708 | switch (CurTok) { |
| 709 | default: |
| 710 | return LogError("unknown token when expecting an expression"); |
| 711 | case tok_identifier: |
| 712 | return ParseIdentifierExpr(); |
| 713 | case tok_number: |
| 714 | return ParseNumberExpr(); |
| 715 | case '(': |
| 716 | return ParseParenExpr(); |
| 717 | case tok_if: |
| 718 | return ParseIfExpr(); |
| 719 | case tok_for: |
| 720 | return ParseForExpr(); |
| 721 | case tok_var: |
| 722 | return ParseVarExpr(); |
| 723 | } |
| 724 | } |

Next we define ParseVarExpr:

.. code-block:: c++

  /// varexpr ::= 'var' identifier ('=' expression)?
  ///                    (',' identifier ('=' expression)?)* 'in' expression
  static std::unique_ptr<ExprAST> ParseVarExpr() {
    getNextToken(); // eat the var.

    std::vector<std::pair<std::string, std::unique_ptr<ExprAST>>> VarNames;

    // At least one variable name is required.
    if (CurTok != tok_identifier)
      return LogError("expected identifier after var");

The first part of this code parses the list of identifier/expr pairs
into the local ``VarNames`` vector.

.. code-block:: c++

    while (true) {
      std::string Name = IdentifierStr;
      getNextToken(); // eat identifier.

      // Read the optional initializer.
      std::unique_ptr<ExprAST> Init;
      if (CurTok == '=') {
        getNextToken(); // eat the '='.

        Init = ParseExpression();
        if (!Init)
          return nullptr;
      }

      VarNames.push_back(std::make_pair(Name, std::move(Init)));

      // End of var list, exit loop.
      if (CurTok != ',')
        break;
      getNextToken(); // eat the ','.

      if (CurTok != tok_identifier)
        return LogError("expected identifier list after var");
    }
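
To see the shape of this loop in isolation, here is a self-contained model
that walks a pre-tokenized list instead of the tutorial's lexer. The ``Tok``
struct and ``parseVarList`` helper are hypothetical, purely for
illustration; only the loop structure mirrors the code above:

.. code-block:: c++

  #include <cassert>
  #include <string>
  #include <utility>
  #include <vector>

  // Toy tokens: Kind is 'i' (identifier), '=' , ',' , or 'n' (number).
  struct Tok { char Kind; std::string Text; };

  // Collect (name, initializer-text) pairs, "" meaning no initializer,
  // following the same identifier / optional '=' expr / ',' rhythm as
  // ParseVarExpr.
  std::vector<std::pair<std::string, std::string>>
  parseVarList(const std::vector<Tok> &Toks) {
    std::vector<std::pair<std::string, std::string>> Vars;
    size_t i = 0;
    while (i < Toks.size() && Toks[i].Kind == 'i') {
      std::string Name = Toks[i++].Text; // eat identifier.
      std::string Init;
      if (i < Toks.size() && Toks[i].Kind == '=') {
        ++i;                             // eat the '='.
        Init = Toks[i++].Text;           // eat the initializer.
      }
      Vars.emplace_back(Name, Init);
      if (i >= Toks.size() || Toks[i].Kind != ',')
        break;                           // end of var list, exit loop.
      ++i;                               // eat the ','.
    }
    return Vars;
  }

Feeding it the tokens of ``a = 1, b, c = 3`` yields three pairs, with ``b``
getting the empty "no initializer" marker.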

Once all the variables are parsed, we then parse the body and create the
AST node:

.. code-block:: c++

    // At this point, we have to have 'in'.
    if (CurTok != tok_in)
      return LogError("expected 'in' keyword after 'var'");
    getNextToken(); // eat 'in'.

    auto Body = ParseExpression();
    if (!Body)
      return nullptr;

    return std::make_unique<VarExprAST>(std::move(VarNames),
                                        std::move(Body));
  }

Now that we can parse and represent the code, we need to support
emission of LLVM IR for it. This code starts out with:

.. code-block:: c++

  Value *VarExprAST::codegen() {
    std::vector<AllocaInst *> OldBindings;

    Function *TheFunction = Builder.GetInsertBlock()->getParent();

    // Register all variables and emit their initializer.
    for (unsigned i = 0, e = VarNames.size(); i != e; ++i) {
      const std::string &VarName = VarNames[i].first;
      ExprAST *Init = VarNames[i].second.get();

This code loops over all the variables, installing them one at a time.
For each variable we put into the symbol table, we remember the
previous value that we replace in OldBindings.

.. code-block:: c++

      // Emit the initializer before adding the variable to scope; this
      // prevents the initializer from referencing the variable itself, and
      // permits stuff like this:
      //  var a = 1 in
      //    var a = a in ...   # refers to outer 'a'.
      Value *InitVal;
      if (Init) {
        InitVal = Init->codegen();
        if (!InitVal)
          return nullptr;
      } else { // If not specified, use 0.0.
        InitVal = ConstantFP::get(TheContext, APFloat(0.0));
      }

      AllocaInst *Alloca = CreateEntryBlockAlloca(TheFunction, VarName);
      Builder.CreateStore(InitVal, Alloca);

      // Remember the old variable binding so that we can restore the binding
      // when we unrecurse.
      OldBindings.push_back(NamedValues[VarName]);

      // Remember this binding.
      NamedValues[VarName] = Alloca;
    }

There are more comments here than code. The basic idea is that we emit
the initializer, create the alloca, then update the symbol table to
point to it. Once all the variables are installed in the symbol table,
we evaluate the body of the var/in expression:

.. code-block:: c++

    // Codegen the body, now that all vars are in scope.
    Value *BodyVal = Body->codegen();
    if (!BodyVal)
      return nullptr;

Finally, before returning, we restore the previous variable bindings:

.. code-block:: c++

    // Pop all our variables from scope.
    for (unsigned i = 0, e = VarNames.size(); i != e; ++i)
      NamedValues[VarNames[i].first] = OldBindings[i];

    // Return the body computation.
    return BodyVal;
  }

The end result of all of this is that we get properly scoped variable
definitions, and we even (trivially) allow mutation of them :).
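
The scoping discipline itself is easy to see in isolation. Below is a
self-contained sketch of the same save/overwrite/restore pattern, with
``int`` standing in for ``AllocaInst*`` (0 playing the role of the null
"no previous binding" entry); the ``evalWithBinding`` helper is
hypothetical, for illustration only:

.. code-block:: c++

  #include <cassert>
  #include <map>
  #include <string>

  // Symbol table, as in the tutorial, but mapping to int instead of
  // AllocaInst*.
  std::map<std::string, int> NamedValues;

  // Mirror of VarExprAST::codegen's scoping: save the shadowed binding,
  // install the new one, "evaluate the body", then restore the old binding.
  int evalWithBinding(const std::string &Name, int NewSlot) {
    int Old = NamedValues[Name];        // remember the old binding (0 if none).
    NamedValues[Name] = NewSlot;        // remember this binding.
    int BodyResult = NamedValues[Name]; // the body sees the inner binding.
    NamedValues[Name] = Old;            // pop the variable from scope.
    return BodyResult;
  }

After the call returns, the outer binding is intact, which is exactly why
``var a = a in ...`` can refer to the outer ``a`` in its initializer and
still leave it visible afterwards.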

With this, we completed what we set out to do. Our nice iterative fib
example from the intro compiles and runs just fine. The mem2reg pass
optimizes all of our stack variables into SSA registers, inserting PHI
nodes where needed, and our front-end remains simple: no "iterated
dominance frontier" computation anywhere in sight.

Full Code Listing
=================

Here is the complete code listing for our running example, enhanced with
mutable variables and var/in support. To build this example, use:

.. code-block:: bash

    # Compile
    clang++ -g toy.cpp `llvm-config --cxxflags --ldflags --system-libs --libs core mcjit native` -O3 -o toy
    # Run
    ./toy

Here is the code:

.. literalinclude:: ../../../examples/Kaleidoscope/Chapter7/toy.cpp
   :language: c++

`Next: Compiling to Object Code <LangImpl08.html>`_
