<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html lang="en">
<head>
  <meta http-equiv="content-type" content="text/html; charset=utf-8">
  <title>llvmpipe</title>
  <link rel="stylesheet" type="text/css" href="mesa.css">
</head>
<body>

<div class="header">
  <h1>The Mesa 3D Graphics Library</h1>
</div>

<iframe src="contents.html"></iframe>
<div class="content">

<h1>Introduction</h1>

<p>
The Gallium llvmpipe driver is a software rasterizer that uses LLVM to
do runtime code generation.
Shaders, point/line/triangle rasterization and vertex processing are
implemented with LLVM IR, which is translated to x86 or x86-64 machine
code.
The driver is also multithreaded to take advantage of multiple CPU cores
(up to 8 at this time).
It is the fastest software rasterizer for Mesa.
</p>


<h1>Requirements</h1>

<ul>
<li>
  <p>An x86 or amd64 processor; 64-bit mode recommended.</p>
  <p>
  Support for SSE2 is strongly encouraged.  Support for SSSE3 and SSE4.1 will
  yield the most efficient code.  The fewer features the CPU has, the more
  likely it is that you will run into underperforming, buggy, or incomplete code.
  </p>
  <p>
  See /proc/cpuinfo to find out what your CPU supports.
  </p>
</li>
<li>
  <p>LLVM: version 3.4 recommended; 3.3 or later required.</p>
  <p>
  On Linux, on a recent Debian-based distribution do:
  </p>
<pre>
  aptitude install llvm-dev
</pre>
  <p>
  On an RPM-based distribution do:
  </p>
<pre>
  yum install llvm-devel
</pre>

  <p>
  For Windows you will need to build LLVM from source with MSVC or MinGW
  (either natively or through cross compilers) and CMake, and set the LLVM
  environment variable to the directory you installed it to.
  </p>
  <p>
  LLVM will be statically linked, so when building with MSVC it needs to be
  built with the same CRT as Mesa, and you'll need to pass
  <code>-DLLVM_USE_CRT_xxx=yyy</code> as described below.
  </p>

  <table border="1">
    <tr>
      <th rowspan="2">LLVM build-type</th>
      <th colspan="2" align="center">Mesa build-type</th>
    </tr>
    <tr>
      <th>debug,checked</th>
      <th>release,profile</th>
    </tr>
    <tr>
      <th>Debug</th>
      <td><code>-DLLVM_USE_CRT_DEBUG=MTd</code></td>
      <td><code>-DLLVM_USE_CRT_DEBUG=MT</code></td>
    </tr>
    <tr>
      <th>Release</th>
      <td><code>-DLLVM_USE_CRT_RELEASE=MTd</code></td>
      <td><code>-DLLVM_USE_CRT_RELEASE=MT</code></td>
    </tr>
  </table>

  <p>
  You can build only the x86 target by passing
  <code>-DLLVM_TARGETS_TO_BUILD=X86</code> to cmake.
  </p>
</li>

<li>
  <p>scons (optional)</p>
</li>
</ul>
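<p>
As a rough way to check the CPU requirement above, the flags line of
/proc/cpuinfo can be searched for the extensions mentioned.  The following is
only a sketch: <code>cpu_simd_report</code> is a hypothetical helper name, and
it assumes the flag spellings <code>sse2</code>, <code>ssse3</code> and
<code>sse4_1</code> that the Linux kernel prints on x86.
</p>

```shell
# cpu_simd_report is a hypothetical helper, not part of Mesa.
# It reports whether sse2, ssse3 and sse4_1 appear in a cpuinfo-style file
# (defaults to /proc/cpuinfo when no file is given).
cpu_simd_report() {
    cpuinfo=${1:-/proc/cpuinfo}
    # Take the first "flags" line and pad it with spaces so that each
    # extension can be matched as a whole word.
    flags=" $(grep -m 1 '^flags' "$cpuinfo" | cut -d: -f2) "
    for ext in sse2 ssse3 sse4_1; do
        case "$flags" in
            *" $ext "*) echo "$ext: yes" ;;
            *)          echo "$ext: no" ;;
        esac
    done
}
```

<p>
Calling <code>cpu_simd_report</code> without arguments reads /proc/cpuinfo and
prints one line per extension.
</p>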


<h1>Building</h1>

<p>To build everything on Linux, invoke scons as:</p>

<pre>
  scons build=debug libgl-xlib
</pre>

<p>Alternatively, you can build it with GNU make by invoking it as:</p>

<pre>
  make linux-llvm
</pre>

<p>The rest of these instructions assume that scons is used.</p>

<p>For Windows the procedure is similar, except for the target:</p>

<pre>
  scons platform=windows build=debug libgl-gdi
</pre>


<h1>Using</h1>

<h2>Linux</h2>

<p>On Linux, building will create a drop-in alternative for libGL.so in</p>

<pre>
  build/foo/gallium/targets/libgl-xlib/libGL.so
</pre>
or
<pre>
  lib/gallium/libGL.so
</pre>

<p>To use it, set the LD_LIBRARY_PATH environment variable to the directory that contains it.</p>
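
<p>
A minimal sketch of that step for a scons build follows.  The
<code>linux-x86_64-debug</code> directory name and the
<code>MESA_LIB_DIR</code> variable are assumptions made for illustration;
substitute whichever directory under build/ your build actually produced.
</p>

```shell
# Directory holding the freshly built libGL.so. The "linux-x86_64-debug"
# component is an assumption; use the build/ subdirectory scons created.
MESA_LIB_DIR="$PWD/build/linux-x86_64-debug/gallium/targets/libgl-xlib"

# Prepend it so the dynamic linker finds this libGL.so before the system
# one, keeping any value LD_LIBRARY_PATH already had.
LD_LIBRARY_PATH="$MESA_LIB_DIR${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
export LD_LIBRARY_PATH

echo "LD_LIBRARY_PATH=$LD_LIBRARY_PATH"
```

<p>
OpenGL programs started from the same shell will then load the llvmpipe
libGL.so instead of the system one.
</p>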

<p>For performance evaluation, pass build=release to scons and use the corresponding
lib directory without the "-debug" suffix.</p>


<h2>Windows</h2>

<p>
On Windows, building will create
<code>build/windows-x86-debug/gallium/targets/libgl-gdi/opengl32.dll</code>,
which is a drop-in alternative for the system's <code>opengl32.dll</code>.  To use
it, put it in the same directory as your application.  It can also be used by
replacing the native ICD driver, but that is quite an advanced usage, so if you
need to ask, don't even try it.
</p>

<p>
There is, however, an easy way to replace the OpenGL software renderer that comes
with Microsoft Windows 7 (or later) with llvmpipe (that is, on systems without
any OpenGL drivers):
</p>

<ul>
  <li><p>copy build/windows-x86-debug/gallium/targets/libgl-gdi/opengl32.dll to C:\Windows\SysWOW64\mesadrv.dll</p></li>
  <li><p>load these registry settings:</p>
  <pre>REGEDIT4

; http://technet.microsoft.com/en-us/library/cc749368.aspx
; http://www.msfn.org/board/topic/143241-portable-windows-7-build-from-winpe-30/page-5#entry942596
[HKEY_LOCAL_MACHINE\SOFTWARE\Wow6432Node\Microsoft\Windows NT\CurrentVersion\OpenGLDrivers\MSOGL]
"DLL"="mesadrv.dll"
"DriverVersion"=dword:00000001
"Flags"=dword:00000001
"Version"=dword:00000002
</pre>
  </li>
  <li>Ditto for 64-bit drivers if you need them.</li>
</ul>


<h1>Profiling</h1>

<p>
To profile llvmpipe, build it as:
</p>
<pre>
  scons build=profile &lt;same-as-before&gt;
</pre>

<p>
This will ensure that frame pointers are used in both C and JIT functions, and
that gcc performs no tail call optimizations.
</p>

<h2>Linux perf integration</h2>

<p>
On Linux, it is possible to have symbol resolution of JIT code with <a href="http://perf.wiki.kernel.org/">Linux perf</a>:
</p>

<pre>
  perf record -g /my/application
  perf report
</pre>

<p>
When run inside Linux perf, llvmpipe will create a /tmp/perf-XXXXX.map file with
a symbol address table.  It also dumps assembly code to /tmp/perf-XXXXX.map.asm,
which can be used by the bin/perf-annotate-jit script to produce a disassembly of
the generated code annotated with the samples.
</p>
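
<p>
To illustrate what the symbol address table is used for, the sketch below
resolves an address to a JIT symbol name by hand.  It is not part of Mesa or
perf: <code>perf_map_lookup</code> is a hypothetical helper, and it assumes the
conventional perf map layout of one <code>START SIZE NAME</code> entry per
line, with START and SIZE in hexadecimal and no <code>0x</code> prefix.
</p>

```shell
# perf_map_lookup MAPFILE ADDRESS
# Hypothetical helper: prints the name of the symbol whose address range
# [START, START + SIZE) contains ADDRESS, or returns 1 if there is none.
perf_map_lookup() {
    map=$1
    addr=$((0x${2#0x}))              # accept the address with or without 0x
    while read -r start size name; do
        [ -n "$name" ] || continue   # skip blank or incomplete lines
        first=$((0x$start))
        end=$((first + 0x$size))
        if [ "$addr" -ge "$first" ]; then
            if [ "$addr" -lt "$end" ]; then
                echo "$name"
                return 0
            fi
        fi
    done < "$map"
    return 1
}
```

<p>
For example, <code>perf_map_lookup /tmp/perf-XXXXX.map 0x7f0000001050</code>
would print the symbol covering that address, where the address shown is a
made-up value and XXXXX stands for the suffix of the file llvmpipe created.
</p>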

<p>You can obtain a call graph via
<a href="http://code.google.com/p/jrfonseca/wiki/Gprof2Dot#linux_perf">Gprof2Dot</a>.</p>


<h1>Unit testing</h1>

<p>
Building will also create several unit tests in
build/linux-???-debug/gallium/drivers/llvmpipe:
</p>

<ul>
<li> lp_test_blend: blending</li>
<li> lp_test_conv: SIMD vector conversion</li>
<li> lp_test_format: pixel unpacking/packing</li>
</ul>

<p>
Some of these tests can output results and benchmarks to a tab-separated file
for later analysis, e.g.:
</p>
<pre>
  build/linux-x86_64-debug/gallium/drivers/llvmpipe/lp_test_blend -o blend.tsv
</pre>


<h1>Development Notes</h1>

<ul>
<li>
  When looking at this code for the first time, start in lp_state_fs.c, and
  then skim through the lp_bld_* functions called in there, and the comments
  at the top of the lp_bld_*.c files.
</li>
<li>
  The driver-independent parts of the LLVM / Gallium code are found in
  src/gallium/auxiliary/gallivm/.  The filenames and function prefixes
  need to be renamed from "lp_bld_" to something else though.
</li>
<li>
  We use LLVM-C bindings for now.  They are not documented, but follow the C++
  interfaces very closely, and appear to be complete enough for code
  generation.  See
  <a href="http://npcontemplation.blogspot.com/2008/06/secret-of-llvm-c-bindings.html">this stand-alone example</a>.
  See the llvm-c/Core.h file for reference.
</li>
</ul>

<h1 id="recommended_reading">Recommended Reading</h1>

<ul>
  <li>
    <p>Rasterization</p>
    <ul>
      <li><a href="http://www.cs.unc.edu/~olano/papers/2dh-tri/">Triangle Scan Conversion using 2D Homogeneous Coordinates</a></li>
      <li><a href="http://www.drdobbs.com/parallel/rasterization-on-larrabee/217200602">Rasterization on Larrabee</a> (<a href="http://devmaster.net/posts/2887/rasterization-on-larrabee">DevMaster copy</a>)</li>
      <li><a href="http://devmaster.net/posts/6133/rasterization-using-half-space-functions">Rasterization using half-space functions</a></li>
      <li><a href="http://devmaster.net/posts/6145/advanced-rasterization">Advanced Rasterization</a></li>
      <li><a href="http://fgiesen.wordpress.com/2013/02/17/optimizing-sw-occlusion-culling-index/">Optimizing Software Occlusion Culling</a></li>
    </ul>
  </li>
  <li>
    <p>Texture sampling</p>
    <ul>
      <li><a href="http://chrishecker.com/Miscellaneous_Technical_Articles#Perspective_Texture_Mapping">Perspective Texture Mapping</a></li>
      <li><a href="http://www.flipcode.com/archives/Texturing_As_In_Unreal.shtml">Texturing As In Unreal</a></li>
      <li><a href="http://www.gamasutra.com/view/feature/3301/runtime_mipmap_filtering.php">Run-Time MIP-Map Filtering</a></li>
      <li><a href="http://alt.3dcenter.org/artikel/2003/10-26_a_english.php">Will "brilinear" filtering persist?</a></li>
      <li><a href="http://ixbtlabs.com/articles2/gffx/nv40-rx800-3.html">Trilinear filtering</a></li>
      <li><a href="http://devmaster.net/posts/12785/texture-swizzling">Texture Swizzling</a></li>
    </ul>
  </li>
  <li>
    <p>SIMD</p>
    <ul>
      <li><a href="http://www.cdl.uni-saarland.de/projects/wfv/#header4">Whole-Function Vectorization</a></li>
    </ul>
  </li>
  <li>
    <p>Optimization</p>
    <ul>
      <li><a href="http://www.drdobbs.com/optimizing-pixomatic-for-modern-x86-proc/184405807">Optimizing Pixomatic For Modern x86 Processors</a></li>
      <li><a href="http://www.intel.com/content/www/us/en/architecture-and-technology/64-ia-32-architectures-optimization-manual.html">Intel 64 and IA-32 Architectures Optimization Reference Manual</a></li>
      <li><a href="http://www.agner.org/optimize/">Software optimization resources</a></li>
      <li><a href="http://software.intel.com/en-us/articles/intel-intrinsics-guide">Intel Intrinsics Guide</a></li>
    </ul>
  </li>
  <li>
    <p>LLVM</p>
    <ul>
      <li><a href="http://llvm.org/docs/LangRef.html">LLVM Language Reference Manual</a></li>
      <li><a href="http://npcontemplation.blogspot.co.uk/2008/06/secret-of-llvm-c-bindings.html">The secret of LLVM C bindings</a></li>
    </ul>
  </li>
  <li>
    <p>General</p>
    <ul>
      <li><a href="http://fgiesen.wordpress.com/2011/07/09/a-trip-through-the-graphics-pipeline-2011-index/">A trip through the Graphics Pipeline</a></li>
      <li><a href="http://msdn.microsoft.com/en-us/library/gg615082.aspx#architecture">WARP Architecture and Performance</a></li>
    </ul>
  </li>
</ul>

</div>
</body>
</html>