blob: 1f9bef886a3e0eb5e1800bb14ff8c5ab5b97511f [file] [log] [blame]
Owen Andersonc04ee692007-10-23 06:03:24 +00001<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
2 "http://www.w3.org/TR/html4/strict.dtd">
3
4<html>
5<head>
6 <title>LLVM Tutorial 2: A More Complicated Function</title>
7 <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
8 <meta name="author" content="Owen Anderson">
9 <meta name="description"
10 content="LLVM Tutorial 2: A More Complicated Function.">
11 <link rel="stylesheet" href="../llvm.css" type="text/css">
12</head>
13
14<body>
15
16<div class="doc_title"> LLVM Tutorial 2: A More Complicated Function </div>
17
18<div class="doc_author">
19 <p>Written by <a href="mailto:owen@apple.com">Owen Anderson</a></p>
20</div>
21
22<!-- *********************************************************************** -->
23<div class="doc_section"><a name="intro">Code Samples</a></div>
24<!-- *********************************************************************** -->
25
26<div class="doc_text">
Owen Andersona7dfb752007-10-23 06:05:37 +000027All the code in this example can be downloaded at <a href="Tutorial2.tar.bz2">Tutorial2.tar.bz2</a> or <a href="Tutorial2.zip">Tutorial2.zip</a>.
Owen Andersonc04ee692007-10-23 06:03:24 +000028</div>
29
30<!-- *********************************************************************** -->
31<div class="doc_section"><a name="intro">A First Function</a></div>
32<!-- *********************************************************************** -->
33
34<div class="doc_text">
35
36<p>Now that we understand the basics of creating functions in LLVM, let's move on to a more complicated example: something with control flow. As an example, let's consider Euclid's Greatest Common Denominator (GCD) algorithm:</p>
37
38<div class="doc_code">
39<pre>
40unsigned gcd(unsigned x, unsigned y) {
41 if(x == y) {
42 return x;
43 } else if(x < y) {
44 return gcd(x, y - x);
45 } else {
46 return gcd(x - y, y);
47 }
48}
49</pre>
50</div>
51
52<p>With this example, we'll learn how to create functions with multiple blocks and control flow, and how to make function calls within your LLVM code. For starters, consider the diagram below.</p>
53
Owen Andersonbad82d82007-10-23 06:22:21 +000054<div style="text-align: center;"><img src="JITTutorial2-1.png" alt="GCD CFG" width="60%"></div>
Owen Andersonc04ee692007-10-23 06:03:24 +000055
Owen Andersonbad82d82007-10-23 06:22:21 +000056<p>The above is a graphical representation of a program in LLVM IR. It places each basic block on a node of a graph, and uses directed edges to indicate flow control. These blocks will be serialized when written to a text or bitcode file, but it is often useful conceptually to think of them as a graph. Again, if you are unsure about the code in the diagram, you should skim through the <a href="../LangRef.html">LLVM Language Reference Manual</a> and convince yourself that it is, in fact, the GCD algorithm.</p>
Owen Andersonc04ee692007-10-23 06:03:24 +000057
58<p>The first part of our code is the same as from first tutorial. The same basic setup is required: creating a module, verifying it, and running the <code>PrintModulePass</code> on it. Even the first segment of <code>makeLLVMModule()</code> looks the same, because <code>gcd</code> happens the have the same prototype as our <code>mul_add</code> function.</p>
59
60<div class="doc_code">
61<pre>
62#include &lt;llvm/Module.h&gt;
63#include &lt;llvm/Function.h&gt;
64#include &lt;llvm/PassManager.h&gt;
65#include &lt;llvm/Analysis/Verifier.h&gt;
66#include &lt;llvm/Assembly/PrintModulePass.h&gt;
67#include &lt;llvm/Support/LLVMBuilder.h&gt;
68
69using namespace llvm;
70
71Module* makeLLVMModule();
72
73int main(int argc, char**argv) {
74 Module* Mod = makeLLVMModule();
75
76 verifyModule(*Mod, PrintMessageAction);
77
78 PassManager PM;
Owen Andersonbad82d82007-10-23 06:22:21 +000079 PM.add(new PrintModulePass(&amp;llvm::cout));
Owen Andersonc04ee692007-10-23 06:03:24 +000080 PM.run(*Mod);
81
82 return 0;
83}
84
85Module* makeLLVMModule() {
Owen Andersonbad82d82007-10-23 06:22:21 +000086 Module* mod = new Module(&quot;tut2&quot;);
Owen Andersonc04ee692007-10-23 06:03:24 +000087
Owen Andersonbad82d82007-10-23 06:22:21 +000088 Constant* c = mod-&gt;getOrInsertFunction(&quot;gcd&quot;,
Owen Andersonc04ee692007-10-23 06:03:24 +000089 IntegerType::get(32),
90 IntegerType::get(32),
91 IntegerType::get(32),
92 NULL);
Owen Andersonbad82d82007-10-23 06:22:21 +000093 Function* gcd = cast&lt;Function&gt;(c);
Owen Andersonc04ee692007-10-23 06:03:24 +000094
Owen Andersonbad82d82007-10-23 06:22:21 +000095 Function::arg_iterator args = gcd-&gt;arg_begin();
Owen Andersonc04ee692007-10-23 06:03:24 +000096 Value* x = args++;
Owen Andersonbad82d82007-10-23 06:22:21 +000097 x-&gt;setName(&quot;x&quot;);
Owen Andersonc04ee692007-10-23 06:03:24 +000098 Value* y = args++;
Owen Andersonbad82d82007-10-23 06:22:21 +000099 y-&gt;setName(&quot;y&quot;);
Owen Andersonc04ee692007-10-23 06:03:24 +0000100</pre>
101</div>
102
103<p>Here, however, is where our code begins to diverge from the first tutorial. Because <code>gcd</code> has control flow, it is composed of multiple blocks interconnected by branching (<code>br</code>) instructions. For those familiar with assembly language, a block is similar to a labeled set of instructions. For those not familiar with assembly language, a block is basically a set of instructions that can be branched to and is executed linearly until the block is terminated by one of a small number of control flow instructions, such as <code>br</code> or <code>ret</code>.</p>
104
105<p>Blocks corresponds to the nodes in the diagram we looked at in the beginning of this tutorial. From the diagram, we can see that this function contains five blocks, so we'll go ahead and create them. Note that, in this code sample, we're making use of LLVM's automatic name uniquing, since we're giving two blocks the same name.</p>
106
107<div class="doc_code">
108<pre>
Owen Andersonbad82d82007-10-23 06:22:21 +0000109 BasicBlock* entry = new BasicBlock(&quot;entry&quot;, gcd);
110 BasicBlock* ret = new BasicBlock(&quot;return&quot;, gcd);
111 BasicBlock* cond_false = new BasicBlock(&quot;cond_false&quot;, gcd);
112 BasicBlock* cond_true = new BasicBlock(&quot;cond_true&quot;, gcd);
113 BasicBlock* cond_false_2 = new BasicBlock(&quot;cond_false&quot;, gcd);
Owen Andersonc04ee692007-10-23 06:03:24 +0000114</pre>
115</div>
116
117<p>Now, we're ready to begin generate code! We'll start with the <code>entry</code> block. This block corresponds to the top-level if-statement in the original C code, so we need to compare <code>x == y</code> To achieve this, we perform an explicity comparison using <code>ICmpEQ</code>. <code>ICmpEQ</code> stands for an <em>integer comparison for equality</em> and returns a 1-bit integer result. This 1-bit result is then used as the input to a conditional branch, with <code>ret</code> as the <code>true</code> and <code>cond_false</code> as the <code>false</code> case.</p>
118
119<div class="doc_code">
120<pre>
121 LLVMBuilder builder(entry);
Owen Andersonbad82d82007-10-23 06:22:21 +0000122 Value* xEqualsY = builder.CreateICmpEQ(x, y, &quot;tmp&quot;);
Owen Andersonc04ee692007-10-23 06:03:24 +0000123 builder.CreateCondBr(xEqualsY, ret, cond_false);
124</pre>
125</div>
126
127<p>Our next block, <code>ret</code>, is pretty simple: it just returns the value of <code>x</code>. Recall that this block is only reached if <code>x == y</code>, so this is the correct behavior. Notice that, instead of creating a new <code>LLVMBuilder</code> for each block, we can use <code>SetInsertPoint</code> to retarget our existing one. This saves on construction and memory allocation costs.</p>
128
129<div class="doc_code">
130<pre>
131 builder.SetInsertPoint(ret);
132 builder.CreateRet(x);
133</pre>
134</div>
135
136<p><code>cond_false</code> is a more interesting block: we now know that <code>x != y</code>, so we must branch again to determine which of <code>x</code> and <code>y</code> is larger. This is achieved using the <code>ICmpULT</code> instruction, which stands for <em>integer comparison for unsigned less-than</em>. In LLVM, integer types do not carry sign; a 32-bit integer pseudo-register can interpreted as signed or unsigned without casting. Whether a signed or unsigned interpretation is desired is specified in the instruction. This is why several instructions in the LLVM IR, such as integer less-than, include a specifier for signed or unsigned.</p>
137
138<p>Also, note that we're again making use of LLVM's automatic name uniquing, this time at a register level. We've deliberately chosen to name every instruction "tmp", to illustrate that LLVM will give them all unique names without getting confused.</p>
139
140<div class="doc_code">
141<pre>
142 builder.SetInsertPoint(cond_false);
Owen Andersonbad82d82007-10-23 06:22:21 +0000143 Value* xLessThanY = builder.CreateICmpULT(x, y, &quot;tmp&quot;);
Owen Andersonc04ee692007-10-23 06:03:24 +0000144 builder.CreateCondBr(xLessThanY, cond_true, cond_false_2);
145</pre>
146</div>
147
148<p>Our last two blocks are quite similar; they're both recursive calls to <code>gcd</code> with different parameters. To create a call instruction, we have to create a <code>vector</code> (or any other container with <code>InputInterator</code>s) to hold the arguments. We then pass in the beginning and ending iterators for this vector.</p>
149
150<div class="doc_code">
151<pre>
152 builder.SetInsertPoint(cond_true);
Owen Andersonbad82d82007-10-23 06:22:21 +0000153 Value* yMinusX = builder.CreateSub(y, x, &quot;tmp&quot;);
154 std::vector&lt;Value*&gt; args1;
Owen Andersonc04ee692007-10-23 06:03:24 +0000155 args1.push_back(x);
156 args1.push_back(yMinusX);
Owen Andersonbad82d82007-10-23 06:22:21 +0000157 Value* recur_1 = builder.CreateCall(gcd, args1.begin(), args1.end(), &quot;tmp&quot;);
Owen Andersonc04ee692007-10-23 06:03:24 +0000158 builder.CreateRet(recur_1);
159
160 builder.SetInsertPoint(cond_false_2);
Owen Andersonbad82d82007-10-23 06:22:21 +0000161 Value* xMinusY = builder.CreateSub(x, y, &quot;tmp&quot;);
162 std::vector&lt;Value*&gt; args2;
Owen Andersonc04ee692007-10-23 06:03:24 +0000163 args2.push_back(xMinusY);
164 args2.push_back(y);
Owen Andersonbad82d82007-10-23 06:22:21 +0000165 Value* recur_2 = builder.CreateCall(gcd, args2.begin(), args2.end(), &quot;tmp&quot;);
Owen Andersonc04ee692007-10-23 06:03:24 +0000166 builder.CreateRet(recur_2);
167
168 return mod;
169}
170</pre>
171</div>
172
173<p>And that's it! You can compile your code and execute your code in the same way as before, by executing:</p>
174
175<div class="doc_code">
176<pre>
177# c++ -g tut2.cpp `llvm-config --cppflags` `llvm-config --ldflags` \
178 `llvm-config --libs core` -o tut2
179# ./tut2
180</pre>
181</div>
182
183</div>
184
185</body>
186</html>