<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
  <meta http-equiv="content-type" content="text/html; charset=utf-8">
  <title>llvmpipe</title>
  <link rel="stylesheet" type="text/css" href="mesa.css">
</head>
<body>

<div class="header">
  <h1>The Mesa 3D Graphics Library</h1>
</div>

<iframe src="contents.html"></iframe>
<div class="content">

<h1>Introduction</h1>

<p>
The Gallium llvmpipe driver is a software rasterizer that uses LLVM to
do runtime code generation.
Shaders, point/line/triangle rasterization and vertex processing are
implemented with LLVM IR which is translated to x86 or x86-64 machine
code.
Also, the driver is multithreaded to take advantage of multiple CPU cores
(up to 8 at this time).
It's the fastest software rasterizer for Mesa.
</p>


<h1>Requirements</h1>

<ul>
<li>
  <p>An x86 or amd64 processor; 64-bit mode recommended.</p>
  <p>
  Support for SSE2 is strongly encouraged.  Support for SSSE3 and SSE4.1 will
  yield the most efficient code.  The fewer features the CPU has, the more
  likely it is that you will run into underperforming, buggy, or incomplete code.
  </p>
  <p>
  See /proc/cpuinfo to find out what your CPU supports.
  </p>
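  <p>
  As a minimal sketch, assuming a Linux system that lists x86 feature flags
  in /proc/cpuinfo under the names sse2, ssse3 and sse4_1, the following
  reports which of the extensions mentioned above are present.  The function
  name check_cpu_flags is made up for this example:
  </p>
<pre>
  check_cpu_flags() {
    # Match whole words only, so that "sse3" is not mistaken for "ssse3".
    for flag in sse2 ssse3 sse4_1; do
      if grep -qw "$flag" /proc/cpuinfo; then
        echo "$flag: yes"
      else
        echo "$flag: no"
      fi
    done
  }

  check_cpu_flags
</pre>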
</li>
<li>
  <p>LLVM: version 3.4 recommended; 3.1 or later required.</p>
  <p>
  For Linux, on a recent Debian-based distribution, run:
  </p>
<pre>
  aptitude install llvm-dev
</pre>
  <p>
  For an RPM-based distribution, run:
  </p>
<pre>
  yum install llvm-devel
</pre>

  <p>
  For Windows you will need to build LLVM from source with MSVC or MinGW
  (either natively or through cross compilers) and CMake, and set the LLVM
  environment variable to the directory you installed it to.
  </p>
  <p>
  LLVM will be statically linked, so when building with MSVC it must be
  built with the same CRT as Mesa: pass
  -DLLVM_USE_CRT_RELEASE=MTd for debug and checked builds, and
  -DLLVM_USE_CRT_RELEASE=MT for profile and release builds.
  </p>
  <p>
  You can build only the x86 target by passing -DLLVM_TARGETS_TO_BUILD=X86
  to cmake.
  </p>
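  <p>
  As an illustration only, a CMake invocation combining these options for an
  LLVM that will be linked into a debug build of Mesa might look like the
  following.  The source directory (C:/src/llvm) and install prefix (C:/llvm)
  are placeholders, and the generator selection, which depends on your
  toolchain, is left out:
  </p>
<pre>
  cmake -DCMAKE_BUILD_TYPE=Release -DLLVM_TARGETS_TO_BUILD=X86 -DLLVM_USE_CRT_RELEASE=MTd -DCMAKE_INSTALL_PREFIX=C:/llvm C:/src/llvm
</pre>
  <p>
  After building and installing, the LLVM environment variable would be set
  to the install prefix chosen here (C:/llvm in this sketch) before invoking
  scons.
  </p>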
</li>

<li>
  <p>scons (optional)</p>
</li>
</ul>


<h1>Building</h1>

To build everything on Linux invoke scons as:

<pre>
  scons build=debug libgl-xlib
</pre>

Alternatively, you can build it with GNU make, if you prefer, by invoking it as

<pre>
  make linux-llvm
</pre>

but the rest of these instructions assume that scons is used.

For Windows the procedure is similar except the target:

<pre>
  scons platform=windows build=debug libgl-gdi
</pre>


<h1>Using</h1>

<h2>Linux</h2>

<p>On Linux, building will create a drop-in alternative for libGL.so in</p>

<pre>
  build/foo/gallium/targets/libgl-xlib/libGL.so
</pre>
or
<pre>
  lib/gallium/libGL.so
</pre>

<p>To use it, set the LD_LIBRARY_PATH environment variable accordingly.</p>
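<p>For example, assuming a 64-bit debug build whose output directory is
build/linux-x86_64-debug (the actual directory name depends on your platform
and build type) and that the glxinfo utility is installed, the renderer in
use can be checked from the top of the source tree with:</p>
<pre>
  LD_LIBRARY_PATH=$PWD/build/linux-x86_64-debug/gallium/targets/libgl-xlib glxinfo | grep "renderer string"
</pre>
<p>If the library was picked up, the reported renderer string should mention
llvmpipe rather than your system's usual driver.</p>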

<p>For performance evaluation pass build=release to scons, and use the corresponding
lib directory without the "-debug" suffix.</p>


<h2>Windows</h2>

<p>
On Windows, building will create
<code>build/windows-x86-debug/gallium/targets/libgl-gdi/opengl32.dll</code>
which is a drop-in alternative for the system's <code>opengl32.dll</code>.  To use
it, put it in the same directory as your application.  It can also be used by
replacing the native ICD driver, but that is quite an advanced usage, so if you
need to ask, don't even try it.
</p>

<p>
There is, however, an easy way to replace the OpenGL software renderer that comes
with Microsoft Windows 7 (or later) with llvmpipe (that is, on systems without
any OpenGL drivers):
</p>

<ul>
  <li><p>copy build/windows-x86-debug/gallium/targets/libgl-gdi/opengl32.dll to C:\Windows\SysWOW64\mesadrv.dll</p></li>
  <li><p>load these registry settings:</p>
  <pre>REGEDIT4

; http://technet.microsoft.com/en-us/library/cc749368.aspx
; http://www.msfn.org/board/topic/143241-portable-windows-7-build-from-winpe-30/page-5#entry942596
[HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Windows NT\CurrentVersion\OpenGLDrivers\MSOGL]
"DLL"="mesadrv.dll"
"DriverVersion"=dword:00000001
"Flags"=dword:00000001
"Version"=dword:00000002
</pre>
  </li>
  <li>Do the same for the 64-bit drivers if you need them.</li>
</ul>



<h1>Profiling</h1>

<p>
To profile llvmpipe you should build as
</p>
<pre>
  scons build=profile &lt;same-as-before&gt;
</pre>

<p>
This will ensure that frame pointers are used both in C and JIT functions, and
that no tail call optimizations are done by gcc.
</p>

<h2>Linux perf integration</h2>

<p>
On Linux, it is possible to have symbol resolution of JIT code with <a href="http://perf.wiki.kernel.org/">Linux perf</a>:
</p>

<pre>
  perf record -g /my/application
  perf report
</pre>

<p>
When run inside Linux perf, llvmpipe will create a /tmp/perf-XXXXX.map file with
a symbol address table.  It also dumps assembly code to /tmp/perf-XXXXX.map.asm,
which can be used by the bin/perf-annotate-jit script to produce disassembly of
the generated code annotated with the samples.
</p>
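
<p>
Linux perf map files are plain text with one symbol per line, in the form
START SIZE NAME, where START and SIZE are hexadecimal.  As a rough sketch
(the function name largest_jit_symbols is made up for this example, and it
assumes a shell that accepts hexadecimal constants in arithmetic expansion),
the largest JIT-compiled functions in such a file can be listed with:
</p>
<pre>
  largest_jit_symbols() {
    # Convert each hexadecimal SIZE to decimal, then sort by size, largest first.
    cat "$1" | while read -r start size name; do
      echo "$((0x$size)) $name"
    done | sort -rn | head -n 10
  }
</pre>
<p>
It would be called as <code>largest_jit_symbols /tmp/perf-XXXXX.map</code>,
with the name of the map file that was actually written.
</p>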

<p>You can obtain a call graph via
<a href="http://code.google.com/p/jrfonseca/wiki/Gprof2Dot#linux_perf">Gprof2Dot</a>.</p>


<h1>Unit testing</h1>

<p>
Building will also create several unit tests in
build/linux-???-debug/gallium/drivers/llvmpipe:
</p>

<ul>
<li> lp_test_blend: blending</li>
<li> lp_test_conv: SIMD vector conversion</li>
<li> lp_test_format: pixel unpacking/packing</li>
</ul>

<p>
Some of these tests can output results and benchmarks to a tab-separated file
for later analysis, e.g.:
</p>
<pre>
  build/linux-x86_64-debug/gallium/drivers/llvmpipe/lp_test_blend -o blend.tsv
</pre>
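
<p>
The resulting file can be examined with ordinary text tools.  As a small
sketch that assumes nothing about which columns the test writes (the function
name tsv_shape is hypothetical), the number of lines and the number of
tab-separated fields on the first line can be checked with:
</p>
<pre>
  tsv_shape() {
    # Print "LINES COLUMNS", taking the column count from the first line.
    awk -F '\t' 'NR == 1 { cols = NF } END { print NR, cols }' "$1"
  }
</pre>
<p>
For instance, <code>tsv_shape blend.tsv</code> gives a quick sanity check
that the run produced data before loading it into a spreadsheet or script.
</p>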


<h1>Development Notes</h1>

<ul>
<li>
  When looking at this code for the first time, start in lp_state_fs.c, and
  then skim through the lp_bld_* functions called in there, and the comments
  at the top of the lp_bld_*.c files.
</li>
<li>
  The driver-independent parts of the LLVM / Gallium code are found in
  src/gallium/auxiliary/gallivm/.  The filenames and function prefixes
  need to be renamed from "lp_bld_" to something else though.
</li>
<li>
  We use LLVM-C bindings for now. They are not documented, but follow the C++
  interfaces very closely, and appear to be complete enough for code
  generation. See
  <a href="http://npcontemplation.blogspot.com/2008/06/secret-of-llvm-c-bindings.html">
  this stand-alone example</a>.  See the llvm-c/Core.h file for reference.
</li>
</ul>

<h1 id="recommended_reading">Recommended Reading</h1>

<ul>
  <li>
    <p>Rasterization</p>
    <ul>
      <li><a href="http://www.cs.unc.edu/~olano/papers/2dh-tri/">Triangle Scan Conversion using 2D Homogeneous Coordinates</a></li>
      <li><a href="http://www.drdobbs.com/parallel/rasterization-on-larrabee/217200602">Rasterization on Larrabee</a> (<a href="http://devmaster.net/posts/2887/rasterization-on-larrabee">DevMaster copy</a>)</li>
      <li><a href="http://devmaster.net/posts/6133/rasterization-using-half-space-functions">Rasterization using half-space functions</a></li>
      <li><a href="http://devmaster.net/posts/6145/advanced-rasterization">Advanced Rasterization</a></li>
      <li><a href="http://fgiesen.wordpress.com/2013/02/17/optimizing-sw-occlusion-culling-index/">Optimizing Software Occlusion Culling</a></li>
    </ul>
  </li>
  <li>
    <p>Texture sampling</p>
    <ul>
      <li><a href="http://chrishecker.com/Miscellaneous_Technical_Articles#Perspective_Texture_Mapping">Perspective Texture Mapping</a></li>
      <li><a href="http://www.flipcode.com/archives/Texturing_As_In_Unreal.shtml">Texturing As In Unreal</a></li>
      <li><a href="http://www.gamasutra.com/view/feature/3301/runtime_mipmap_filtering.php">Run-Time MIP-Map Filtering</a></li>
      <li><a href="http://alt.3dcenter.org/artikel/2003/10-26_a_english.php">Will "brilinear" filtering persist?</a></li>
      <li><a href="http://ixbtlabs.com/articles2/gffx/nv40-rx800-3.html">Trilinear filtering</a></li>
      <li><a href="http://devmaster.net/posts/12785/texture-swizzling">Texture Swizzling</a></li>
    </ul>
  </li>
  <li>
    <p>SIMD</p>
    <ul>
      <li><a href="http://www.cdl.uni-saarland.de/projects/wfv/#header4">Whole-Function Vectorization</a></li>
    </ul>
  </li>
  <li>
    <p>Optimization</p>
    <ul>
      <li><a href="http://www.drdobbs.com/optimizing-pixomatic-for-modern-x86-proc/184405807">Optimizing Pixomatic For Modern x86 Processors</a></li>
      <li><a href="http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-optimization-manual.html">Intel 64 and IA-32 Architectures Optimization Reference Manual</a></li>
      <li><a href="http://www.agner.org/optimize/">Software optimization resources</a></li>
      <li><a href="http://software.intel.com/en-us/articles/intel-intrinsics-guide">Intel Intrinsics Guide</a></li>
    </ul>
  </li>
  <li>
    <p>LLVM</p>
    <ul>
      <li><a href="http://llvm.org/docs/LangRef.html">LLVM Language Reference Manual</a></li>
      <li><a href="http://npcontemplation.blogspot.co.uk/2008/06/secret-of-llvm-c-bindings.html">The secret of LLVM C bindings</a></li>
    </ul>
  </li>
  <li>
    <p>General</p>
    <ul>
      <li><a href="http://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-graphics-pipeline-2011-index/">A trip through the Graphics Pipeline</a></li>
      <li><a href="http://msdn.microsoft.com/en-us/library/gg615082.aspx#architecture">WARP Architecture and Performance</a></li>
    </ul>
  </li>
</ul>

</div>
</body>
</html>