blob: 17ff78c2f0e8692f8a08e7d78239bf03816357ec [file] [log] [blame]
Owen Andersonc04ee692007-10-23 06:03:24 +00001<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
2 "http://www.w3.org/TR/html4/strict.dtd">
3
4<html>
5<head>
6 <title>LLVM Tutorial 2: A More Complicated Function</title>
7 <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
8 <meta name="author" content="Owen Anderson">
9 <meta name="description"
10 content="LLVM Tutorial 2: A More Complicated Function.">
11 <link rel="stylesheet" href="../llvm.css" type="text/css">
12</head>
13
14<body>
15
16<div class="doc_title"> LLVM Tutorial 2: A More Complicated Function </div>
17
18<div class="doc_author">
19 <p>Written by <a href="mailto:owen@apple.com">Owen Anderson</a></p>
20</div>
21
22<!-- *********************************************************************** -->
Owen Andersonc04ee692007-10-23 06:03:24 +000023<div class="doc_section"><a name="intro">A First Function</a></div>
24<!-- *********************************************************************** -->
25
26<div class="doc_text">
27
28<p>Now that we understand the basics of creating functions in LLVM, let's move on to a more complicated example: something with control flow. As an example, let's consider Euclid's Greatest Common Denominator (GCD) algorithm:</p>
29
30<div class="doc_code">
31<pre>
32unsigned gcd(unsigned x, unsigned y) {
33 if(x == y) {
34 return x;
Chris Lattner729eb142008-02-10 19:11:04 +000035 } else if(x &lt; y) {
Owen Andersonc04ee692007-10-23 06:03:24 +000036 return gcd(x, y - x);
37 } else {
38 return gcd(x - y, y);
39 }
40}
41</pre>
42</div>
43
44<p>With this example, we'll learn how to create functions with multiple blocks and control flow, and how to make function calls within your LLVM code. For starters, consider the diagram below.</p>
45
Owen Andersonbad82d82007-10-23 06:22:21 +000046<div style="text-align: center;"><img src="JITTutorial2-1.png" alt="GCD CFG" width="60%"></div>
Owen Andersonc04ee692007-10-23 06:03:24 +000047
Chris Lattner729eb142008-02-10 19:11:04 +000048<p>This is a graphical representation of a program in LLVM IR. It places each basic block on a node of a graph and uses directed edges to indicate flow control. These blocks will be serialized when written to a text or bitcode file, but it is often useful conceptually to think of them as a graph. Again, if you are unsure about the code in the diagram, you should skim through the <a href="../LangRef.html">LLVM Language Reference Manual</a> and convince yourself that it is, in fact, the GCD algorithm.</p>
Owen Andersonc04ee692007-10-23 06:03:24 +000049
Owen Andersonf4bc9b12007-11-19 16:10:59 +000050<p>The first part of our code is practically the same as from the first tutorial. The same basic setup is required: creating a module, verifying it, and running the <code>PrintModulePass</code> on it. Even the first segment of <code>makeLLVMModule()</code> looks essentially the same, except that <code>gcd</code> takes one fewer parameter than <code>mul_add</code>.</p>
Owen Andersonc04ee692007-10-23 06:03:24 +000051
52<div class="doc_code">
53<pre>
54#include &lt;llvm/Module.h&gt;
55#include &lt;llvm/Function.h&gt;
56#include &lt;llvm/PassManager.h&gt;
57#include &lt;llvm/Analysis/Verifier.h&gt;
58#include &lt;llvm/Assembly/PrintModulePass.h&gt;
Duncan Sands89f6d882008-04-13 06:22:09 +000059#include &lt;llvm/Support/IRBuilder.h&gt;
Owen Andersonc04ee692007-10-23 06:03:24 +000060
61using namespace llvm;
62
63Module* makeLLVMModule();
64
65int main(int argc, char**argv) {
66 Module* Mod = makeLLVMModule();
67
68 verifyModule(*Mod, PrintMessageAction);
69
70 PassManager PM;
Owen Andersonbad82d82007-10-23 06:22:21 +000071 PM.add(new PrintModulePass(&amp;llvm::cout));
Owen Andersonc04ee692007-10-23 06:03:24 +000072 PM.run(*Mod);
Bill Wendling6acc8592008-05-19 00:20:45 +000073
74 delete Mod;
Owen Andersonc04ee692007-10-23 06:03:24 +000075 return 0;
76}
77
78Module* makeLLVMModule() {
Owen Andersonbad82d82007-10-23 06:22:21 +000079 Module* mod = new Module(&quot;tut2&quot;);
Owen Andersonc04ee692007-10-23 06:03:24 +000080
Owen Andersonbad82d82007-10-23 06:22:21 +000081 Constant* c = mod-&gt;getOrInsertFunction(&quot;gcd&quot;,
Owen Andersonc04ee692007-10-23 06:03:24 +000082 IntegerType::get(32),
83 IntegerType::get(32),
84 IntegerType::get(32),
85 NULL);
Owen Andersonbad82d82007-10-23 06:22:21 +000086 Function* gcd = cast&lt;Function&gt;(c);
Owen Andersonc04ee692007-10-23 06:03:24 +000087
Owen Andersonbad82d82007-10-23 06:22:21 +000088 Function::arg_iterator args = gcd-&gt;arg_begin();
Owen Andersonc04ee692007-10-23 06:03:24 +000089 Value* x = args++;
Owen Andersonbad82d82007-10-23 06:22:21 +000090 x-&gt;setName(&quot;x&quot;);
Owen Andersonc04ee692007-10-23 06:03:24 +000091 Value* y = args++;
Owen Andersonbad82d82007-10-23 06:22:21 +000092 y-&gt;setName(&quot;y&quot;);
Owen Andersonc04ee692007-10-23 06:03:24 +000093</pre>
94</div>
95
96<p>Here, however, is where our code begins to diverge from the first tutorial. Because <code>gcd</code> has control flow, it is composed of multiple blocks interconnected by branching (<code>br</code>) instructions. For those familiar with assembly language, a block is similar to a labeled set of instructions. For those not familiar with assembly language, a block is basically a set of instructions that can be branched to and is executed linearly until the block is terminated by one of a small number of control flow instructions, such as <code>br</code> or <code>ret</code>.</p>
97
Chris Lattner729eb142008-02-10 19:11:04 +000098<p>Blocks correspond to the nodes in the diagram we looked at in the beginning of this tutorial. From the diagram, we can see that this function contains five blocks, so we'll go ahead and create them. Note that we're making use of LLVM's automatic name uniquing in this code sample, since we're giving two blocks the same name.</p>
Owen Andersonc04ee692007-10-23 06:03:24 +000099
100<div class="doc_code">
101<pre>
Gabor Greifdf7d2b42008-04-19 22:25:09 +0000102 BasicBlock* entry = BasicBlock::Create(&quot;entry&quot;, gcd);
103 BasicBlock* ret = BasicBlock::Create(&quot;return&quot;, gcd);
104 BasicBlock* cond_false = BasicBlock::Create(&quot;cond_false&quot;, gcd);
105 BasicBlock* cond_true = BasicBlock::Create(&quot;cond_true&quot;, gcd);
106 BasicBlock* cond_false_2 = BasicBlock::Create(&quot;cond_false&quot;, gcd);
Owen Andersonc04ee692007-10-23 06:03:24 +0000107</pre>
108</div>
109
Chris Lattner729eb142008-02-10 19:11:04 +0000110<p>Now we're ready to begin generating code! We'll start with the <code>entry</code> block. This block corresponds to the top-level if-statement in the original C code, so we need to compare <code>x</code> and <code>y</code>. To achieve this, we perform an explicit comparison using <code>ICmpEQ</code>. <code>ICmpEQ</code> stands for an <em>integer comparison for equality</em> and returns a 1-bit integer result. This 1-bit result is then used as the input to a conditional branch, with <code>ret</code> as the <code>true</code> and <code>cond_false</code> as the <code>false</code> case.</p>
Owen Andersonc04ee692007-10-23 06:03:24 +0000111
112<div class="doc_code">
113<pre>
Duncan Sands89f6d882008-04-13 06:22:09 +0000114 IRBuilder builder(entry);
Owen Andersonbad82d82007-10-23 06:22:21 +0000115 Value* xEqualsY = builder.CreateICmpEQ(x, y, &quot;tmp&quot;);
Owen Andersonc04ee692007-10-23 06:03:24 +0000116 builder.CreateCondBr(xEqualsY, ret, cond_false);
117</pre>
118</div>
119
Duncan Sands89f6d882008-04-13 06:22:09 +0000120<p>Our next block, <code>ret</code>, is pretty simple: it just returns the value of <code>x</code>. Recall that this block is only reached if <code>x == y</code>, so this is the correct behavior. Notice that instead of creating a new <code>IRBuilder</code> for each block, we can use <code>SetInsertPoint</code> to retarget our existing one. This saves on construction and memory allocation costs.</p>
Owen Andersonc04ee692007-10-23 06:03:24 +0000121
122<div class="doc_code">
123<pre>
124 builder.SetInsertPoint(ret);
125 builder.CreateRet(x);
126</pre>
127</div>
128
129<p><code>cond_false</code> is a more interesting block: we now know that <code>x != y</code>, so we must branch again to determine which of <code>x</code> and <code>y</code> is larger. This is achieved using the <code>ICmpULT</code> instruction, which stands for <em>integer comparison for unsigned less-than</em>. In LLVM, integer types do not carry sign; a 32-bit integer pseudo-register can interpreted as signed or unsigned without casting. Whether a signed or unsigned interpretation is desired is specified in the instruction. This is why several instructions in the LLVM IR, such as integer less-than, include a specifier for signed or unsigned.</p>
130
Chris Lattner729eb142008-02-10 19:11:04 +0000131<p>Also note that we're again making use of LLVM's automatic name uniquing, this time at a register level. We've deliberately chosen to name every instruction "tmp" to illustrate that LLVM will give them all unique names without getting confused.</p>
Owen Andersonc04ee692007-10-23 06:03:24 +0000132
133<div class="doc_code">
134<pre>
135 builder.SetInsertPoint(cond_false);
Owen Andersonbad82d82007-10-23 06:22:21 +0000136 Value* xLessThanY = builder.CreateICmpULT(x, y, &quot;tmp&quot;);
Owen Andersonc04ee692007-10-23 06:03:24 +0000137 builder.CreateCondBr(xLessThanY, cond_true, cond_false_2);
138</pre>
139</div>
140
141<p>Our last two blocks are quite similar; they're both recursive calls to <code>gcd</code> with different parameters. To create a call instruction, we have to create a <code>vector</code> (or any other container with <code>InputInterator</code>s) to hold the arguments. We then pass in the beginning and ending iterators for this vector.</p>
142
143<div class="doc_code">
144<pre>
145 builder.SetInsertPoint(cond_true);
Owen Andersonbad82d82007-10-23 06:22:21 +0000146 Value* yMinusX = builder.CreateSub(y, x, &quot;tmp&quot;);
147 std::vector&lt;Value*&gt; args1;
Owen Andersonc04ee692007-10-23 06:03:24 +0000148 args1.push_back(x);
149 args1.push_back(yMinusX);
Owen Andersonbad82d82007-10-23 06:22:21 +0000150 Value* recur_1 = builder.CreateCall(gcd, args1.begin(), args1.end(), &quot;tmp&quot;);
Owen Andersonc04ee692007-10-23 06:03:24 +0000151 builder.CreateRet(recur_1);
152
153 builder.SetInsertPoint(cond_false_2);
Owen Andersonbad82d82007-10-23 06:22:21 +0000154 Value* xMinusY = builder.CreateSub(x, y, &quot;tmp&quot;);
155 std::vector&lt;Value*&gt; args2;
Owen Andersonc04ee692007-10-23 06:03:24 +0000156 args2.push_back(xMinusY);
157 args2.push_back(y);
Owen Andersonbad82d82007-10-23 06:22:21 +0000158 Value* recur_2 = builder.CreateCall(gcd, args2.begin(), args2.end(), &quot;tmp&quot;);
Owen Andersonc04ee692007-10-23 06:03:24 +0000159 builder.CreateRet(recur_2);
160
161 return mod;
162}
163</pre>
164</div>
165
Duncan Sandse0a34352007-11-05 15:15:50 +0000166<p>And that's it! You can compile and execute your code in the same way as before, by doing:</p>
Owen Andersonc04ee692007-10-23 06:03:24 +0000167
168<div class="doc_code">
169<pre>
Owen Andersonaec96002008-03-24 21:38:01 +0000170# c++ -g tut2.cpp `llvm-config --cxxflags --ldflags --libs core` -o tut2
Owen Andersonc04ee692007-10-23 06:03:24 +0000171# ./tut2
172</pre>
173</div>
174
175</div>
176
Owen Anderson34ba67a2007-10-25 06:45:01 +0000177<!-- *********************************************************************** -->
178<hr>
179<address>
180 <a href="http://jigsaw.w3.org/css-validator/check/referer"><img
181 src="http://jigsaw.w3.org/css-validator/images/vcss" alt="Valid CSS!"></a>
182 <a href="http://validator.w3.org/check/referer"><img
183 src="http://www.w3.org/Icons/valid-html401" alt="Valid HTML 4.01!"></a>
184
185 <a href="mailto:owen@apple.com">Owen Anderson</a><br>
186 <a href="http://llvm.org">The LLVM Compiler Infrastructure</a><br>
187 Last modified: $Date: 2007-10-17 11:05:13 -0700 (Wed, 17 Oct 2007) $
188</address>
189
Owen Andersonc04ee692007-10-23 06:03:24 +0000190</body>
Duncan Sandse0a34352007-11-05 15:15:50 +0000191</html>