blob: 3909fb616bf1d680412f8603c0c967426a15505b [file] [log] [blame]
Chris Lattner0d8c2db2004-05-23 21:02:20 +00001<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
2 "http://www.w3.org/TR/html4/strict.dtd">
3<html>
4<head>
Gordon Henriksen326e24f2007-09-27 19:31:36 +00005 <meta http-equiv="Content-Type" Content="text/html; charset=UTF-8" >
Chris Lattner0d8c2db2004-05-23 21:02:20 +00006 <title>Accurate Garbage Collection with LLVM</title>
7 <link rel="stylesheet" href="llvm.css" type="text/css">
Gordon Henriksen326e24f2007-09-27 19:31:36 +00008 <style type="text/css">
9 .rowhead { text-align: left; background: inherit; }
10 .indent { padding-left: 1em; }
11 .optl { color: #BFBFBF; }
12 </style>
Chris Lattner0d8c2db2004-05-23 21:02:20 +000013</head>
14<body>
15
16<div class="doc_title">
17 Accurate Garbage Collection with LLVM
18</div>
19
20<ol>
21 <li><a href="#introduction">Introduction</a>
22 <ul>
Gordon Henriksen326e24f2007-09-27 19:31:36 +000023 <li><a href="#feature">GC features provided and algorithms
24 supported</a></li>
Chris Lattner0d8c2db2004-05-23 21:02:20 +000025 </ul>
26 </li>
27
Gordon Henriksen326e24f2007-09-27 19:31:36 +000028 <li><a href="#usage">Using the collectors</a>
Chris Lattner0d8c2db2004-05-23 21:02:20 +000029 <ul>
Gordon Henriksen326e24f2007-09-27 19:31:36 +000030 <li><a href="#shadow-stack">ShadowStack -
31 A highly portable collector</a></li>
32 <li><a href="#semispace">SemiSpace -
33 A simple copying collector runtime</a></li>
34 <li><a href="#ocaml">Ocaml -
35 An Objective Caml-compatible collector</a></li>
36 </ul>
37 </li>
38
Gordon Henriksenad93c4f2007-12-11 00:30:17 +000039 <li><a href="#core">Core support</a>
Gordon Henriksen326e24f2007-09-27 19:31:36 +000040 <ul>
Gordon Henriksenad93c4f2007-12-11 00:30:17 +000041 <li><a href="#gcattr">Specifying GC code generation:
42 <tt>gc "..."</tt></a></li>
Gordon Henriksen326e24f2007-09-27 19:31:36 +000043 <li><a href="#gcroot">Identifying GC roots on the stack:
44 <tt>llvm.gcroot</tt></a></li>
45 <li><a href="#barriers">Reading and writing references in the heap</a>
46 <ul>
47 <li><a href="#gcwrite">Write barrier: <tt>llvm.gcwrite</tt></a></li>
48 <li><a href="#gcread">Read barrier: <tt>llvm.gcread</tt></a></li>
49 </ul>
50 </li>
51 </ul>
52 </li>
53
54 <li><a href="#runtime">Recommended runtime interface</a>
55 <ul>
56 <li><a href="#initialize">Garbage collector startup and
57 initialization</a></li>
Chris Lattner0d8c2db2004-05-23 21:02:20 +000058 <li><a href="#allocate">Allocating memory from the GC</a></li>
Gordon Henriksen326e24f2007-09-27 19:31:36 +000059 <li><a href="#explicit">Explicit invocation of the garbage
60 collector</a></li>
61 <li><a href="#traceroots">Tracing GC pointers from the program
62 stack</a></li>
63 <li><a href="#staticroots">Tracing GC pointers from static roots</a></li>
Chris Lattner0d8c2db2004-05-23 21:02:20 +000064 </ul>
65 </li>
66
Gordon Henriksen326e24f2007-09-27 19:31:36 +000067 <li><a href="#plugin">Implementing a collector plugin</a>
Chris Lattner0d8c2db2004-05-23 21:02:20 +000068 <ul>
Gordon Henriksen326e24f2007-09-27 19:31:36 +000069 <li><a href="#collector-algos">Overview of available features</a></li>
70 <li><a href="#stack-map">Computing stack maps</a></li>
71 <li><a href="#init-roots">Initializing roots to null:
72 <tt>InitRoots</tt></a></li>
73 <li><a href="#custom">Custom lowering of intrinsics: <tt>CustomRoots</tt>,
74 <tt>CustomReadBarriers</tt>, and <tt>CustomWriteBarriers</tt></a></li>
75 <li><a href="#safe-points">Generating safe points:
76 <tt>NeededSafePoints</tt></a></li>
77 <li><a href="#assembly">Emitting assembly code:
Gordon Henriksen01571ef2008-08-24 03:18:23 +000078 <tt>GCMetadataPrinter</tt></a></li>
Chris Lattner0b02dbc2004-07-09 05:03:54 +000079 </ul>
Chris Lattner9b2a1842004-05-27 05:52:10 +000080 </li>
Chris Lattner0d8c2db2004-05-23 21:02:20 +000081
Gordon Henriksen326e24f2007-09-27 19:31:36 +000082 <li><a href="#runtime-impl">Implementing a collector runtime</a>
83 <ul>
84 <li><a href="#gcdescriptors">Tracing GC pointers from heap
85 objects</a></li>
86 </ul>
87 </li>
88
89 <li><a href="#references">References</a></li>
90
Chris Lattner0d8c2db2004-05-23 21:02:20 +000091</ol>
92
93<div class="doc_author">
Gordon Henriksen326e24f2007-09-27 19:31:36 +000094 <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a> and
95 Gordon Henriksen</p>
Chris Lattner0d8c2db2004-05-23 21:02:20 +000096</div>
97
98<!-- *********************************************************************** -->
99<div class="doc_section">
100 <a name="introduction">Introduction</a>
101</div>
102<!-- *********************************************************************** -->
103
104<div class="doc_text">
105
106<p>Garbage collection is a widely used technique that frees the programmer from
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000107having to know the lifetimes of heap objects, making software easier to produce
108and maintain. Many programming languages rely on garbage collection for
109automatic memory management. There are two primary forms of garbage collection:
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000110conservative and accurate.</p>
111
112<p>Conservative garbage collection often does not require any special support
113from either the language or the compiler: it can handle non-type-safe
114programming languages (such as C/C++) and does not require any special
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000115information from the compiler. The
Jeff Cohen65fc36b2007-04-18 17:26:14 +0000116<a href="http://www.hpl.hp.com/personal/Hans_Boehm/gc/">Boehm collector</a> is
117an example of a state-of-the-art conservative collector.</p>
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000118
119<p>Accurate garbage collection requires the ability to identify all pointers in
120the program at run-time (which requires that the source-language be type-safe in
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000121most cases). Identifying pointers at run-time requires compiler support to
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000122locate all places that hold live pointer variables at run-time, including the
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000123<a href="#gcroot">processor stack and registers</a>.</p>
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000124
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000125<p>Conservative garbage collection is attractive because it does not require any
126special compiler support, but it does have problems. In particular, because the
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000127conservative garbage collector cannot <i>know</i> that a particular word in the
128machine is a pointer, it cannot move live objects in the heap (preventing the
129use of compacting and generational GC algorithms) and it can occasionally suffer
130from memory leaks due to integer values that happen to point to objects in the
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000131program. In addition, some aggressive compiler transformations can break
132conservative garbage collectors (though these seem rare in practice).</p>
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000133
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000134<p>Accurate garbage collectors do not suffer from any of these problems, but
135they can suffer from degraded scalar optimization of the program. In particular,
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000136because the runtime must be able to identify and update all pointers active in
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000137the program, some optimizations are less effective. In practice, however, the
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000138locality and performance benefits of using aggressive garbage allocation
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000139techniques dominates any low-level losses.</p>
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000140
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000141<p>This document describes the mechanisms and interfaces provided by LLVM to
142support accurate garbage collection.</p>
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000143
144</div>
145
146<!-- ======================================================================= -->
147<div class="doc_subsection">
148 <a name="feature">GC features provided and algorithms supported</a>
149</div>
150
151<div class="doc_text">
152
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000153<p>LLVM's intermediate representation provides <a href="#intrinsics">garbage
Chris Lattner05d67092008-04-24 05:59:56 +0000154collection intrinsics</a> that offer support for a broad class of
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000155collector models. For instance, the intrinsics permit:</p>
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000156
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000157<ul>
158 <li>semi-space collectors</li>
159 <li>mark-sweep collectors</li>
160 <li>generational collectors</li>
161 <li>reference counting</li>
162 <li>incremental collectors</li>
163 <li>concurrent collectors</li>
164 <li>cooperative collectors</li>
165</ul>
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000166
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000167<p>We hope that the primitive support built into the LLVM IR is sufficient to
168support a broad class of garbage collected languages including Scheme, ML, Java,
169C#, Perl, Python, Lua, Ruby, other scripting languages, and more.</p>
170
171<p>However, LLVM does not itself implement a garbage collector. This is because
172collectors are tightly coupled to object models, and LLVM is agnostic to object
173models. Since LLVM is agnostic to object models, it would be inappropriate for
174LLVM to dictate any particular collector. Instead, LLVM provides a framework for
175garbage collector implementations in two manners:</p>
176
177<ul>
178 <li><b>At compile time</b> with <a href="#plugin">collector plugins</a> for
179 the compiler. Collector plugins have ready access to important garbage
180 collector algorithms. Leveraging these tools, it is straightforward to
181 emit type-accurate stack maps for your runtime in as little as ~100 lines of
182 C++ code.</li>
183
184 <li><b>At runtime</b> with <a href="#runtime">suggested runtime
185 interfaces</a>, which allow front-end compilers to support a range of
186 collection runtimes.</li>
187</ul>
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000188
189</div>
190
191<!-- *********************************************************************** -->
192<div class="doc_section">
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000193 <a name="usage">Using the collectors</a>
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000194</div>
195<!-- *********************************************************************** -->
196
197<div class="doc_text">
198
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000199<p>In general, using a collector implies:</p>
200
201<ul>
202 <li>Emitting compatible code, including initialization in the main
Gordon Henriksenad93c4f2007-12-11 00:30:17 +0000203 program if necessary.</li>
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000204 <li>Loading a compiler plugin if the collector is not statically linked with
205 your compiler. For <tt>llc</tt>, use the <tt>-load</tt> option.</li>
Gordon Henriksenad93c4f2007-12-11 00:30:17 +0000206 <li>Selecting the collection algorithm by applying the <tt>gc "..."</tt>
207 attribute to your garbage collected functions, or equivalently with
Gordon Henriksen01571ef2008-08-24 03:18:23 +0000208 the <tt>setGC</tt> method.</li>
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000209 <li>Linking your final executable with the garbage collector runtime.</li>
210</ul>
211
212<p>This table summarizes the available runtimes.</p>
213
214<table>
215 <tr>
216 <th>Collector</th>
Gordon Henriksenad93c4f2007-12-11 00:30:17 +0000217 <th><tt>gc</tt> attribute</th>
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000218 <th>Linkage</th>
219 <th><tt>gcroot</tt></th>
220 <th><tt>gcread</tt></th>
221 <th><tt>gcwrite</tt></th>
222 </tr>
223 <tr valign="baseline">
224 <td><a href="#semispace">SemiSpace</a></td>
Gordon Henriksenad93c4f2007-12-11 00:30:17 +0000225 <td><tt>gc "shadow-stack"</tt></td>
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000226 <td>TODO FIXME</td>
227 <td>required</td>
228 <td>optional</td>
229 <td>optional</td>
230 </tr>
231 <tr valign="baseline">
232 <td><a href="#ocaml">Ocaml</a></td>
Gordon Henriksenad93c4f2007-12-11 00:30:17 +0000233 <td><tt>gc "ocaml"</tt></td>
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000234 <td><i>provided by ocamlopt</i></td>
235 <td>required</td>
236 <td>optional</td>
237 <td>optional</td>
238 </tr>
239</table>
240
241<p>The sections for <a href="#intrinsics">Collection intrinsics</a> and
242<a href="#runtime">Recommended runtime interface</a> detail the interfaces that
243collectors may require user programs to utilize.</p>
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000244
245</div>
246
247<!-- ======================================================================= -->
248<div class="doc_subsection">
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000249 <a name="shadow-stack">ShadowStack - A highly portable collector</a>
250</div>
251
252<div class="doc_code"><tt>
253 Collector *llvm::createShadowStackCollector();
254</tt></div>
255
256<div class="doc_text">
257
Gordon Henriksenad93c4f2007-12-11 00:30:17 +0000258<p>The ShadowStack backend is invoked with the <tt>gc "shadow-stack"</tt>
259function attribute.
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000260Unlike many collectors which rely on a cooperative code generator to generate
261stack maps, this algorithm carefully maintains a linked list of stack root
262descriptors [<a href="#henderson02">Henderson2002</a>]. This so-called "shadow
Gordon Henriksenad93c4f2007-12-11 00:30:17 +0000263stack" mirrors the machine stack. Maintaining this data structure is slower
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000264than using stack maps, but has a significant portability advantage because it
265requires no special support from the target code generator.</p>
266
267<p>The ShadowStack collector does not use read or write barriers, so the user
268program may use <tt>load</tt> and <tt>store</tt> instead of <tt>llvm.gcread</tt>
269and <tt>llvm.gcwrite</tt>.</p>
270
Gordon Henriksenad93c4f2007-12-11 00:30:17 +0000271<p>ShadowStack is a code generator plugin only. It must be paired with a
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000272compatible runtime.</p>
273
274</div>
275
276<!-- ======================================================================= -->
277<div class="doc_subsection">
278 <a name="semispace">SemiSpace - A simple copying collector runtime</a>
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000279</div>
280
281<div class="doc_text">
282
Chris Lattner05d67092008-04-24 05:59:56 +0000283<p>The SemiSpace runtime implements the <a href="runtime">suggested
284runtime interface</a> and is compatible with the ShadowStack backend.</p>
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000285
286<p>SemiSpace is a very simple copying collector. When it starts up, it
287allocates two blocks of memory for the heap. It uses a simple bump-pointer
288allocator to allocate memory from the first block until it runs out of space.
289When it runs out of space, it traces through all of the roots of the program,
290copying blocks to the other half of the memory space.</p>
291
292<p>This runtime is highly experimental and has not been used in a real project.
293Enhancements would be welcomed.</p>
294
295</div>
296
297<!-- ======================================================================= -->
298<div class="doc_subsection">
299 <a name="ocaml">Ocaml - An Objective Caml-compatible collector</a>
300</div>
301
302<div class="doc_code"><tt>
303 Collector *llvm::createOcamlCollector();
304</tt></div>
305
306<div class="doc_text">
307
Gordon Henriksenad93c4f2007-12-11 00:30:17 +0000308<p>The ocaml backend is invoked with the <tt>gc "ocaml"</tt> function attribute.
309It supports the
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000310<a href="http://caml.inria.fr/">Objective Caml</a> language runtime by emitting
311a type-accurate stack map in the form of an ocaml 3.10.0-compatible frametable.
312The linkage requirements are satisfied automatically by the <tt>ocamlopt</tt>
313compiler when linking an executable.</p>
314
315<p>The ocaml collector does not use read or write barriers, so the user program
316may use <tt>load</tt> and <tt>store</tt> instead of <tt>llvm.gcread</tt> and
317<tt>llvm.gcwrite</tt>.</p>
318
319</div>
320
321
322<!-- *********************************************************************** -->
323<div class="doc_section">
Chris Lattner05d67092008-04-24 05:59:56 +0000324 <a name="core">Core support</a><a name="intrinsics"></a>
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000325</div>
326<!-- *********************************************************************** -->
327
328<div class="doc_text">
329
330<p>This section describes the garbage collection facilities provided by the
331<a href="LangRef.html">LLVM intermediate representation</a>.</p>
332
333<p>These facilities are limited to those strictly necessary for compilation.
334They are not intended to be a complete interface to any garbage collector.
335Notably, heap allocation is not among the supplied primitives. A user program
336will also need to interface with the runtime, using either the
337<a href="#runtime">suggested runtime interface</a> or another interface
338specified by the runtime.</p>
339
340</div>
341
342<!-- ======================================================================= -->
343<div class="doc_subsection">
Gordon Henriksenad93c4f2007-12-11 00:30:17 +0000344 <a name="gcattr">Specifying GC code generation: <tt>gc "..."</tt></a>
345</div>
346
347<div class="doc_code"><tt>
348 define <i>ty</i> @<i>name</i>(...) <u>gc "<i>collector</i>"</u> { ...
349</tt></div>
350
351<div class="doc_text">
352
353<p>The <tt>gc</tt> function attribute is used to specify the desired collector
Chris Lattner05d67092008-04-24 05:59:56 +0000354algorithm to the compiler. It is equivalent to specifying the collector name
Gordon Henriksen01571ef2008-08-24 03:18:23 +0000355programmatically using the <tt>setGC</tt> method of <tt>Function</tt>.</p>
Gordon Henriksenad93c4f2007-12-11 00:30:17 +0000356
357<p>Specifying the collector on a per-function basis allows LLVM to link together
Chris Lattner05d67092008-04-24 05:59:56 +0000358programs that use different garbage collection algorithms.</p>
Gordon Henriksenad93c4f2007-12-11 00:30:17 +0000359
360</div>
361
362<!-- ======================================================================= -->
363<div class="doc_subsection">
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000364 <a name="gcroot">Identifying GC roots on the stack: <tt>llvm.gcroot</tt></a>
365</div>
366
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000367<div class="doc_code"><tt>
Chris Lattner17340552008-04-24 06:00:30 +0000368 void @llvm.gcroot(i8** %ptrloc, i8* %metadata)
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000369</tt></div>
370
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000371<div class="doc_text">
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000372
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000373<p>The <tt>llvm.gcroot</tt> intrinsic is used to inform LLVM of a pointer
Chris Lattner05d67092008-04-24 05:59:56 +0000374variable on the stack. The first argument <b>must</b> be a value referring to an alloca instruction
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000375or a bitcast of an alloca. The second contains a pointer to metadata that
376should be associated with the pointer, and <b>must</b> be a constant or global
377value address. If your target collector uses tags, use a null pointer for
378metadata.</p>
379
380<p>Consider the following fragment of Java code:</p>
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000381
382<pre>
383 {
384 Object X; // A null-initialized reference to an object
385 ...
386 }
387</pre>
388
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000389<p>This block (which may be located in the middle of a function or in a loop
390nest), could be compiled to this LLVM code:</p>
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000391
392<pre>
393Entry:
394 ;; In the entry block for the function, allocate the
395 ;; stack space for X, which is an LLVM pointer.
396 %X = alloca %Object*
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000397
398 ;; Tell LLVM that the stack space is a stack root.
399 ;; Java has type-tags on objects, so we pass null as metadata.
400 %tmp = bitcast %Object** %X to i8**
Chris Lattner17340552008-04-24 06:00:30 +0000401 call void @llvm.gcroot(i8** %X, i8* null)
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000402 ...
403
404 ;; "CodeBlock" is the block corresponding to the start
Reid Spencer03d186a2004-05-25 08:45:31 +0000405 ;; of the scope above.
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000406CodeBlock:
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000407 ;; Java null-initializes pointers.
408 store %Object* null, %Object** %X
409
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000410 ...
411
412 ;; As the pointer goes out of scope, store a null value into
413 ;; it, to indicate that the value is no longer live.
414 store %Object* null, %Object** %X
415 ...
416</pre>
417
418</div>
419
420<!-- ======================================================================= -->
421<div class="doc_subsection">
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000422 <a name="barriers">Reading and writing references in the heap</a>
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000423</div>
424
425<div class="doc_text">
426
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000427<p>Some collectors need to be informed when the mutator (the program that needs
428garbage collection) either reads a pointer from or writes a pointer to a field
429of a heap object. The code fragments inserted at these points are called
430<em>read barriers</em> and <em>write barriers</em>, respectively. The amount of
431code that needs to be executed is usually quite small and not on the critical
432path of any computation, so the overall performance impact of the barrier is
433tolerable.</p>
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000434
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000435<p>Barriers often require access to the <em>object pointer</em> rather than the
436<em>derived pointer</em> (which is a pointer to the field within the
437object). Accordingly, these intrinsics take both pointers as separate arguments
438for completeness. In this snippet, <tt>%object</tt> is the object pointer, and
439<tt>%derived</tt> is the derived pointer:</p>
440
Chris Lattner05d67092008-04-24 05:59:56 +0000441<blockquote><pre>
442 ;; An array type.
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000443 %class.Array = type { %class.Object, i32, [0 x %class.Object*] }
Chris Lattner05d67092008-04-24 05:59:56 +0000444 ...
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000445
446 ;; Load the object pointer from a gcroot.
447 %object = load %class.Array** %object_addr
448
449 ;; Compute the derived pointer.
Chris Lattner05d67092008-04-24 05:59:56 +0000450 %derived = getelementptr %object, i32 0, i32 2, i32 %n</pre></blockquote>
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000451
452</div>
453
454<!-- ======================================================================= -->
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000455<div class="doc_subsubsection">
456 <a name="gcwrite">Write barrier: <tt>llvm.gcwrite</tt></a>
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000457</div>
458
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000459<div class="doc_code"><tt>
460void @llvm.gcwrite(i8* %value, i8* %object, i8** %derived)
461</tt></div>
462
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000463<div class="doc_text">
464
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000465<p>For write barriers, LLVM provides the <tt>llvm.gcwrite</tt> intrinsic
466function. It has exactly the same semantics as a non-volatile <tt>store</tt> to
467the derived pointer (the third argument).</p>
468
469<p>Many important algorithms require write barriers, including generational
470and concurrent collectors. Additionally, write barriers could be used to
471implement reference counting.</p>
472
473<p>The use of this intrinsic is optional if the target collector does use
474write barriers. If so, the collector will replace it with the corresponding
475<tt>store</tt>.</p>
476
477</div>
478
479<!-- ======================================================================= -->
480<div class="doc_subsubsection">
481 <a name="gcread">Read barrier: <tt>llvm.gcread</tt></a>
482</div>
483
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000484<div class="doc_code"><tt>
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000485i8* @llvm.gcread(i8* %object, i8** %derived)<br>
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000486</tt></div>
487
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000488<div class="doc_text">
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000489
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000490<p>For read barriers, LLVM provides the <tt>llvm.gcread</tt> intrinsic function.
491It has exactly the same semantics as a non-volatile <tt>load</tt> from the
492derived pointer (the second argument).</p>
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000493
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000494<p>Read barriers are needed by fewer algorithms than write barriers, and may
495have a greater performance impact since pointer reads are more frequent than
496writes.</p>
497
498<p>As with <tt>llvm.gcwrite</tt>, a target collector might not require the use
499of this intrinsic.</p>
500
501</div>
502
503<!-- *********************************************************************** -->
504<div class="doc_section">
505 <a name="runtime">Recommended runtime interface</a>
506</div>
507<!-- *********************************************************************** -->
508
509<div class="doc_text">
510
511<p>LLVM specifies the following recommended runtime interface to the garbage
512collection at runtime. A program should use these interfaces to accomplish the
513tasks not supported by the intrinsics.</p>
514
515<p>Unlike the intrinsics, which are integral to LLVM's code generator, there is
516nothing unique about these interfaces; a front-end compiler and runtime are free
517to agree to a different specification.</p>
518
519<p class="doc_warning">Note: This interface is a work in progress.</p>
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000520
521</div>
522
523<!-- ======================================================================= -->
524<div class="doc_subsection">
525 <a name="initialize">Garbage collector startup and initialization</a>
526</div>
527
528<div class="doc_text">
529
530<div class="doc_code"><tt>
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000531 void llvm_gc_initialize(unsigned InitialHeapSize);
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000532</tt></div>
533
534<p>
535The <tt>llvm_gc_initialize</tt> function should be called once before any other
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000536garbage collection functions are called. This gives the garbage collector the
537chance to initialize itself and allocate the heap. The initial heap size to
538allocate should be specified as an argument.
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000539</p>
540
541</div>
542
543<!-- ======================================================================= -->
544<div class="doc_subsection">
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000545 <a name="allocate">Allocating memory from the GC</a>
546</div>
547
548<div class="doc_text">
549
550<div class="doc_code"><tt>
551 void *llvm_gc_allocate(unsigned Size);
552</tt></div>
553
554<p>The <tt>llvm_gc_allocate</tt> function is a global function defined by the
555garbage collector implementation to allocate memory. It returns a
556zeroed-out block of memory of the specified size, sufficiently aligned to store
557any object.</p>
558
559</div>
560
561<!-- ======================================================================= -->
562<div class="doc_subsection">
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000563 <a name="explicit">Explicit invocation of the garbage collector</a>
564</div>
565
566<div class="doc_text">
567
568<div class="doc_code"><tt>
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000569 void llvm_gc_collect();
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000570</tt></div>
571
572<p>
573The <tt>llvm_gc_collect</tt> function is exported by the garbage collector
574implementations to provide a full collection, even when the heap is not
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000575exhausted. This can be used by end-user code as a hint, and may be ignored by
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000576the garbage collector.
577</p>
578
579</div>
580
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000581<!-- ======================================================================= -->
582<div class="doc_subsection">
Chris Lattner9b2a1842004-05-27 05:52:10 +0000583 <a name="traceroots">Tracing GC pointers from the program stack</a>
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000584</div>
585
586<div class="doc_text">
587 <div class="doc_code"><tt>
588 void llvm_cg_walk_gcroots(void (*FP)(void **Root, void *Meta));
589 </tt></div>
590
591<p>
592The <tt>llvm_cg_walk_gcroots</tt> function is a function provided by the code
593generator that iterates through all of the GC roots on the stack, calling the
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000594specified function pointer with each record. For each GC root, the address of
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000595the pointer and the meta-data (from the <a
Chris Lattner05d67092008-04-24 05:59:56 +0000596href="#gcroot"><tt>llvm.gcroot</tt></a> intrinsic) are provided.
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000597</p>
598</div>
599
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000600<!-- ======================================================================= -->
601<div class="doc_subsection">
Chris Lattner9b2a1842004-05-27 05:52:10 +0000602 <a name="staticroots">Tracing GC pointers from static roots</a>
603</div>
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000604
Chris Lattner9b2a1842004-05-27 05:52:10 +0000605<div class="doc_text">
606TODO
607</div>
608
609
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000610<!-- *********************************************************************** -->
611<div class="doc_section">
612 <a name="plugin">Implementing a collector plugin</a>
613</div>
614<!-- *********************************************************************** -->
615
616<div class="doc_text">
617
Gordon Henriksen01571ef2008-08-24 03:18:23 +0000618<p>User code specifies which GC code generation to use with the <tt>gc</tt>
619function attribute or, equivalently, with the <tt>setGC</tt> method of
Gordon Henriksenad93c4f2007-12-11 00:30:17 +0000620<tt>Function</tt>.</p>
621
Gordon Henriksen01571ef2008-08-24 03:18:23 +0000622<p>To implement a GC plugin, it is necessary to subclass
623<tt>llvm::GCStrategy</tt>, which can be accomplished in a few lines of
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000624boilerplate code. LLVM's infrastructure provides access to several important
625algorithms. For an uncontroversial collector, all that remains may be to emit
626the assembly code for the collector's unique stack map data structure, which
627might be accomplished in as few as 100 LOC.</p>
628
Gordon Henriksen01571ef2008-08-24 03:18:23 +0000629<p>This is not the appropriate place to implement a garbage collected heap or a
630garbage collector itself. That code should exist in the language's runtime
631library. The compiler plugin is responsible for generating code which is
632compatible with that runtime library.</p>
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000633
Gordon Henriksen01571ef2008-08-24 03:18:23 +0000634<p>To subclass <tt>llvm::GCStrategy</tt> and register it with the compiler:</p>
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000635
Gordon Henriksen01571ef2008-08-24 03:18:23 +0000636<blockquote><pre>// lib/MyGC/MyGC.cpp - Example LLVM GC plugin
637
638#include "llvm/CodeGen/GCStrategy.h"
639#include "llvm/CodeGen/GCMetadata.h"
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000640#include "llvm/Support/Compiler.h"
641
642using namespace llvm;
643
644namespace {
Gordon Henriksen01571ef2008-08-24 03:18:23 +0000645 class VISIBILITY_HIDDEN MyGC : public GCStrategy {
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000646 public:
Gordon Henriksen01571ef2008-08-24 03:18:23 +0000647 MyGC() {}
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000648 };
649
Gordon Henriksen01571ef2008-08-24 03:18:23 +0000650 GCRegistry::Add&lt;MyGC&gt;
Gordon Henriksenad93c4f2007-12-11 00:30:17 +0000651 X("mygc", "My bespoke garbage collector.");
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000652}</pre></blockquote>
653
654<p>Using the LLVM makefiles (like the <a
655href="http://llvm.org/viewvc/llvm-project/llvm/trunk/projects/sample/">sample
656project</a>), this can be built into a plugin using a simple makefile:</p>
657
658<blockquote><pre
659># lib/MyGC/Makefile
660
661LEVEL := ../..
662LIBRARYNAME = <var>MyGC</var>
663LOADABLE_MODULE = 1
664
665include $(LEVEL)/Makefile.common</pre></blockquote>
666
Gordon Henriksenad93c4f2007-12-11 00:30:17 +0000667<p>Once the plugin is compiled, code using it may be compiled using <tt>llc
668-load=<var>MyGC.so</var></tt> (though <var>MyGC.so</var> may have some other
669platform-specific extension):</p>
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000670
671<blockquote><pre
Gordon Henriksenad93c4f2007-12-11 00:30:17 +0000672>$ cat sample.ll
673define void @f() gc "mygc" {
674entry:
675 ret void
676}
677$ llvm-as &lt; sample.ll | llc -load=MyGC.so</pre></blockquote>
678
679<p>It is also possible to statically link the collector plugin into tools, such
680as a language-specific compiler front-end.</p>
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000681
682</div>
683
684<!-- ======================================================================= -->
685<div class="doc_subsection">
686 <a name="collector-algos">Overview of available features</a>
687</div>
688
689<div class="doc_text">
690
691<p>The boilerplate collector above does nothing. More specifically:</p>
692
693<ul>
694 <li><tt>llvm.gcread</tt> calls are replaced with the corresponding
695 <tt>load</tt> instruction.</li>
696 <li><tt>llvm.gcwrite</tt> calls are replaced with the corresponding
697 <tt>store</tt> instruction.</li>
698 <li>No stack map is emitted, and no safe points are added.</li>
699</ul>
700
701<p><tt>Collector</tt> provides a range of features through which a plugin
702collector may do useful work. This matrix summarizes the supported (and planned)
703features and correlates them with the collection techniques which typically
704require them.</p>
705
706<table>
707 <tr>
708 <th>Algorithm</th>
709 <th>Done</th>
710 <th>shadow stack</th>
711 <th>refcount</th>
712 <th>mark-sweep</th>
713 <th>copying</th>
714 <th>incremental</th>
715 <th>threaded</th>
716 <th>concurrent</th>
717 </tr>
718 <tr>
719 <th class="rowhead"><a href="#stack-map">stack map</a></th>
720 <td>&#10004;</td>
721 <td></td>
722 <td></td>
723 <td>&#10008;</td>
724 <td>&#10008;</td>
725 <td>&#10008;</td>
726 <td>&#10008;</td>
727 <td>&#10008;</td>
728 </tr>
729 <tr>
730 <th class="rowhead"><a href="#init-roots">initialize roots</a></th>
731 <td>&#10004;</td>
732 <td>&#10008;</td>
733 <td>&#10008;</td>
734 <td>&#10008;</td>
735 <td>&#10008;</td>
736 <td>&#10008;</td>
737 <td>&#10008;</td>
738 <td>&#10008;</td>
739 </tr>
740 <tr class="doc_warning">
741 <th class="rowhead">derived pointers</th>
742 <td>NO</td>
743 <td></td>
744 <td></td>
745 <td></td>
746 <td></td>
747 <td></td>
748 <td>&#10008;*</td>
749 <td>&#10008;*</td>
750 </tr>
751 <tr>
752 <th class="rowhead"><em><a href="#custom">custom lowering</a></em></th>
753 <td>&#10004;</td>
754 <th></th>
755 <th></th>
756 <th></th>
757 <th></th>
758 <th></th>
759 <th></th>
760 <th></th>
761 </tr>
762 <tr>
763 <th class="rowhead indent">gcroot</th>
764 <td>&#10004;</td>
765 <td>&#10008;</td>
766 <td>&#10008;</td>
767 <td></td>
768 <td></td>
769 <td></td>
770 <td></td>
771 <td></td>
772 </tr>
773 <tr>
774 <th class="rowhead indent">gcwrite</th>
775 <td>&#10004;</td>
776 <td></td>
777 <td>&#10008;</td>
778 <td></td>
779 <td></td>
780 <td>&#10008;</td>
781 <td></td>
782 <td>&#10008;</td>
783 </tr>
784 <tr>
785 <th class="rowhead indent">gcread</th>
786 <td>&#10004;</td>
787 <td></td>
788 <td></td>
789 <td></td>
790 <td></td>
791 <td></td>
792 <td></td>
793 <td>&#10008;</td>
794 </tr>
795 <tr>
796 <th class="rowhead"><em><a href="#safe-points">safe points</a></em></th>
797 <td></td>
798 <th></th>
799 <th></th>
800 <th></th>
801 <th></th>
802 <th></th>
803 <th></th>
804 <th></th>
805 </tr>
806 <tr>
807 <th class="rowhead indent">in calls</th>
808 <td>&#10004;</td>
809 <td></td>
810 <td></td>
811 <td>&#10008;</td>
812 <td>&#10008;</td>
813 <td>&#10008;</td>
814 <td>&#10008;</td>
815 <td>&#10008;</td>
816 </tr>
817 <tr>
818 <th class="rowhead indent">before calls</th>
819 <td>&#10004;</td>
820 <td></td>
821 <td></td>
822 <td></td>
823 <td></td>
824 <td></td>
825 <td>&#10008;</td>
826 <td>&#10008;</td>
827 </tr>
828 <tr class="doc_warning">
829 <th class="rowhead indent">for loops</th>
830 <td>NO</td>
831 <td></td>
832 <td></td>
833 <td></td>
834 <td></td>
835 <td></td>
836 <td>&#10008;</td>
837 <td>&#10008;</td>
838 </tr>
839 <tr>
840 <th class="rowhead indent">before escape</th>
841 <td>&#10004;</td>
842 <td></td>
843 <td></td>
844 <td></td>
845 <td></td>
846 <td></td>
847 <td>&#10008;</td>
848 <td>&#10008;</td>
849 </tr>
850 <tr class="doc_warning">
851 <th class="rowhead">emit code at safe points</th>
852 <td>NO</td>
853 <td></td>
854 <td></td>
855 <td></td>
856 <td></td>
857 <td></td>
858 <td>&#10008;</td>
859 <td>&#10008;</td>
860 </tr>
861 <tr>
862 <th class="rowhead"><em>output</em></th>
863 <td></td>
864 <th></th>
865 <th></th>
866 <th></th>
867 <th></th>
868 <th></th>
869 <th></th>
870 <th></th>
871 </tr>
872 <tr>
873 <th class="rowhead indent"><a href="#assembly">assembly</a></th>
874 <td>&#10004;</td>
875 <td></td>
876 <td></td>
877 <td>&#10008;</td>
878 <td>&#10008;</td>
879 <td>&#10008;</td>
880 <td>&#10008;</td>
881 <td>&#10008;</td>
882 </tr>
883 <tr class="doc_warning">
884 <th class="rowhead indent">JIT</th>
885 <td>NO</td>
886 <td></td>
887 <td></td>
888 <td class="optl">&#10008;</td>
889 <td class="optl">&#10008;</td>
890 <td class="optl">&#10008;</td>
891 <td class="optl">&#10008;</td>
892 <td class="optl">&#10008;</td>
893 </tr>
894 <tr class="doc_warning">
895 <th class="rowhead indent">obj</th>
896 <td>NO</td>
897 <td></td>
898 <td></td>
899 <td class="optl">&#10008;</td>
900 <td class="optl">&#10008;</td>
901 <td class="optl">&#10008;</td>
902 <td class="optl">&#10008;</td>
903 <td class="optl">&#10008;</td>
904 </tr>
905 <tr class="doc_warning">
906 <th class="rowhead">live analysis</th>
907 <td>NO</td>
908 <td></td>
909 <td></td>
910 <td class="optl">&#10008;</td>
911 <td class="optl">&#10008;</td>
912 <td class="optl">&#10008;</td>
913 <td class="optl">&#10008;</td>
914 <td class="optl">&#10008;</td>
915 </tr>
916 <tr class="doc_warning">
917 <th class="rowhead">register map</th>
918 <td>NO</td>
919 <td></td>
920 <td></td>
921 <td class="optl">&#10008;</td>
922 <td class="optl">&#10008;</td>
923 <td class="optl">&#10008;</td>
924 <td class="optl">&#10008;</td>
925 <td class="optl">&#10008;</td>
926 </tr>
927 <tr>
928 <td colspan="10">
929 <div><span class="doc_warning">*</span> Derived pointers only pose a
930 hazard to copying collectors.</div>
931 <div><span class="optl">&#10008;</span> in gray denotes a feature which
932 could be utilized if available.</div>
933 </td>
934 </tr>
935</table>
936
937<p>To be clear, the collection techniques above are defined as:</p>
938
939<dl>
940 <dt>Shadow Stack</dt>
941 <dd>The mutator carefully maintains a linked list of stack root
942 descriptors.</dd>
943 <dt>Reference Counting</dt>
944 <dd>The mutator maintains a reference count for each object and frees an
945 object when its count falls to zero.</dd>
946 <dt>Mark-Sweep</dt>
947 <dd>When the heap is exhausted, the collector marks reachable objects starting
948 from the roots, then deallocates unreachable objects in a sweep
949 phase.</dd>
950 <dt>Copying</dt>
951 <dd>As reachability analysis proceeds, the collector copies objects from one
952 heap area to another, compacting them in the process. Copying collectors
953 enable highly efficient "bump pointer" allocation and can improve locality
954 of reference.</dd>
955 <dt>Incremental</dt>
956 <dd>(Including generational collectors.) Incremental collectors generally have
957 all the properties of a copying collector (regardless of whether the
958 mature heap is compacting), but bring the added complexity of requiring
959 write barriers.</dd>
960 <dt>Threaded</dt>
961 <dd>Denotes a multithreaded mutator; the collector must still stop the mutator
962 ("stop the world") before beginning reachability analysis. Stopping a
963 multithreaded mutator is a complicated problem. It generally requires
964 highly platform specific code in the runtime, and the production of
965 carefully designed machine code at safe points.</dd>
966 <dt>Concurrent</dt>
967 <dd>In this technique, the mutator and the collector run concurrently, with
968 the goal of eliminating pause times. In a <em>cooperative</em> collector,
969 the mutator further aids with collection should a pause occur, allowing
970 collection to take advantage of multiprocessor hosts. The "stop the world"
971 problem of threaded collectors is generally still present to a limited
972 extent. Sophisticated marking algorithms are necessary. Read barriers may
973 be necessary.</dd>
974</dl>
975
976<p>As the matrix indicates, LLVM's garbage collection infrastructure is already
977suitable for a wide variety of collectors, but does not currently extend to
978multithreaded programs. This will be added in the future as there is
979interest.</p>
980
981</div>
982
983<!-- ======================================================================= -->
984<div class="doc_subsection">
985 <a name="stack-map">Computing stack maps</a>
986</div>
987
988<div class="doc_text">
989
990<blockquote><pre
Gordon Henriksenad93c4f2007-12-11 00:30:17 +0000991>for (iterator I = begin(), E = end(); I != E; ++I) {
Gordon Henriksen01571ef2008-08-24 03:18:23 +0000992 GCFunctionInfo *FI = *I;
993 unsigned FrameSize = FI-&gt;getFrameSize();
994 size_t RootCount = FI-&gt;roots_size();
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000995
Gordon Henriksen01571ef2008-08-24 03:18:23 +0000996 for (GCFunctionInfo::roots_iterator RI = FI-&gt;roots_begin(),
997 RE = FI-&gt;roots_end();
998 RI != RE; ++RI) {
Gordon Henriksenad93c4f2007-12-11 00:30:17 +0000999 int RootNum = RI->Num;
1000 int RootStackOffset = RI->StackOffset;
1001 Constant *RootMetadata = RI->Metadata;
1002 }
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001003}</pre></blockquote>
1004
Gordon Henriksen01571ef2008-08-24 03:18:23 +00001005<p>LLVM automatically computes a stack map. All a <tt>GCStrategy</tt> needs to do
1006is access it using <tt>GCFunctionMetadata::roots_begin()</tt> and
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001007-<tt>end()</tt>. If the <tt>llvm.gcroot</tt> intrinsic is eliminated before code
1008generation by a custom lowering pass, LLVM's stack map will be empty.</p>
1009
1010</div>
1011
1012
1013<!-- ======================================================================= -->
1014<div class="doc_subsection">
1015 <a name="init-roots">Initializing roots to null: <tt>InitRoots</tt></a>
1016</div>
1017
1018<div class="doc_text">
1019
1020<blockquote><pre
Gordon Henriksen01571ef2008-08-24 03:18:23 +00001021>MyGC::MyGC() {
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001022 InitRoots = true;
1023}</pre></blockquote>
1024
1025<p>When set, LLVM will automatically initialize each root to <tt>null</tt> upon
Gordon Henriksen01571ef2008-08-24 03:18:23 +00001026entry to the function. This prevents the GC's sweep phase from visiting
1027uninitialized pointers, which will almost certainly cause it to crash. This
1028initialization occurs before custom lowering, so the two may be used
1029together.</p>
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001030
Gordon Henriksen01571ef2008-08-24 03:18:23 +00001031<p>Since LLVM does not yet compute liveness information, there is no means of
1032distinguishing an uninitialized stack root from an initialized one. Therefore,
1033this feature should be used by all GC plugins. It is enabled by default.</p>
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001034
1035</div>
1036
1037
1038<!-- ======================================================================= -->
1039<div class="doc_subsection">
1040 <a name="custom">Custom lowering of intrinsics: <tt>CustomRoots</tt>,
1041 <tt>CustomReadBarriers</tt>, and <tt>CustomWriteBarriers</tt></a>
1042</div>
1043
1044<div class="doc_text">
1045
Gordon Henriksen01571ef2008-08-24 03:18:23 +00001046<p>For GCs which use barriers or unusual treatment of stack roots, these
1047flags allow the collector to perform arbitrary transformations of the LLVM
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001048IR:</p>
1049
1050<blockquote><pre
Gordon Henriksen01571ef2008-08-24 03:18:23 +00001051>class MyGC : public GCStrategy {
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001052public:
Gordon Henriksen01571ef2008-08-24 03:18:23 +00001053 MyGC() {
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001054 CustomRoots = true;
1055 CustomReadBarriers = true;
1056 CustomWriteBarriers = true;
1057 }
1058
Gordon Henriksenad93c4f2007-12-11 00:30:17 +00001059 virtual bool initializeCustomLowering(Module &amp;M);
1060 virtual bool performCustomLowering(Function &amp;F);
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001061};</pre></blockquote>
1062
1063<p>If any of these flags are set, then LLVM suppresses its default lowering for
Gordon Henriksen01571ef2008-08-24 03:18:23 +00001064the corresponding intrinsics and instead calls
1065<tt>performCustomLowering</tt>.</p>
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001066
1067<p>LLVM's default action for each intrinsic is as follows:</p>
1068
1069<ul>
1070 <li><tt>llvm.gcroot</tt>: Pass through to the code generator to generate a
1071 stack map.</li>
1072 <li><tt>llvm.gcread</tt>: Substitute a <tt>load</tt> instruction.</li>
1073 <li><tt>llvm.gcwrite</tt>: Substitute a <tt>store</tt> instruction.</li>
1074</ul>
1075
1076<p>If <tt>CustomReadBarriers</tt> or <tt>CustomWriteBarriers</tt> are specified,
Gordon Henriksenad93c4f2007-12-11 00:30:17 +00001077then <tt>performCustomLowering</tt> <strong>must</strong> eliminate the
1078corresponding barriers.</p>
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001079
Gordon Henriksen01571ef2008-08-24 03:18:23 +00001080<p><tt>performCustomLowering</tt> must comply with the same restrictions as <a
1081href="WritingAnLLVMPass.html#runOnFunction"><tt
1082>FunctionPass::runOnFunction</tt></a>.
1083Likewise, <tt>initializeCustomLowering</tt> has the same semantics as <a
1084href="WritingAnLLVMPass.html#doInitialization_mod"><tt
1085>Pass::doInitialization(Module&amp;)</tt></a>.</p>
Gordon Henriksenad93c4f2007-12-11 00:30:17 +00001086
1087<p>The following can be used as a template:</p>
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001088
1089<blockquote><pre
Gordon Henriksenad93c4f2007-12-11 00:30:17 +00001090>#include "llvm/Module.h"
Gordon Henriksen0adede02007-12-22 23:32:32 +00001091#include "llvm/IntrinsicInst.h"
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001092
Gordon Henriksen01571ef2008-08-24 03:18:23 +00001093bool MyGC::initializeCustomLowering(Module &amp;M) {
Gordon Henriksenad93c4f2007-12-11 00:30:17 +00001094 return false;
1095}
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001096
Gordon Henriksen01571ef2008-08-24 03:18:23 +00001097bool MyGC::performCustomLowering(Function &amp;F) {
Gordon Henriksenad93c4f2007-12-11 00:30:17 +00001098 bool MadeChange = false;
1099
1100 for (Function::iterator BB = F.begin(), E = F.end(); BB != E; ++BB)
Gordon Henriksen74f4ded2007-12-22 23:34:26 +00001101 for (BasicBlock::iterator II = BB-&gt;begin(), E = BB-&gt;end(); II != E; )
1102 if (IntrinsicInst *CI = dyn_cast&lt;IntrinsicInst&gt;(II++))
Gordon Henriksen0adede02007-12-22 23:32:32 +00001103 if (Function *F = CI-&gt;getCalledFunction())
1104 switch (F-&gt;getIntrinsicID()) {
1105 case Intrinsic::gcwrite:
Gordon Henriksenad93c4f2007-12-11 00:30:17 +00001106 // Handle llvm.gcwrite.
Gordon Henriksen0adede02007-12-22 23:32:32 +00001107 CI-&gt;eraseFromParent();
Gordon Henriksenad93c4f2007-12-11 00:30:17 +00001108 MadeChange = true;
Gordon Henriksen0adede02007-12-22 23:32:32 +00001109 break;
1110 case Intrinsic::gcread:
Gordon Henriksenad93c4f2007-12-11 00:30:17 +00001111 // Handle llvm.gcread.
Gordon Henriksen0adede02007-12-22 23:32:32 +00001112 CI-&gt;eraseFromParent();
Gordon Henriksenad93c4f2007-12-11 00:30:17 +00001113 MadeChange = true;
Gordon Henriksen0adede02007-12-22 23:32:32 +00001114 break;
1115 case Intrinsic::gcroot:
Gordon Henriksenad93c4f2007-12-11 00:30:17 +00001116 // Handle llvm.gcroot.
Gordon Henriksen0adede02007-12-22 23:32:32 +00001117 CI-&gt;eraseFromParent();
Gordon Henriksenad93c4f2007-12-11 00:30:17 +00001118 MadeChange = true;
Gordon Henriksen0adede02007-12-22 23:32:32 +00001119 break;
Gordon Henriksenad93c4f2007-12-11 00:30:17 +00001120 }
1121
1122 return MadeChange;
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001123}</pre></blockquote>
1124
1125</div>
1126
1127
1128<!-- ======================================================================= -->
1129<div class="doc_subsection">
1130 <a name="safe-points">Generating safe points: <tt>NeededSafePoints</tt></a>
1131</div>
1132
1133<div class="doc_text">
1134
1135<p>LLVM can compute four kinds of safe points:</p>
1136
1137<blockquote><pre
1138>namespace GC {
1139 /// PointKind - The type of a collector-safe point.
1140 ///
1141 enum PointKind {
1142 Loop, //&lt; Instr is a loop (backwards branch).
1143 Return, //&lt; Instr is a return instruction.
1144 PreCall, //&lt; Instr is a call instruction.
1145 PostCall //&lt; Instr is the return address of a call.
1146 };
1147}</pre></blockquote>
1148
1149<p>A collector can request any combination of the four by setting the
1150<tt>NeededSafePoints</tt> mask:</p>
1151
1152<blockquote><pre
Gordon Henriksen01571ef2008-08-24 03:18:23 +00001153>MyGC::MyGC() {
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001154 NeededSafePoints = 1 &lt;&lt; GC::Loop
1155 | 1 &lt;&lt; GC::Return
1156 | 1 &lt;&lt; GC::PreCall
1157 | 1 &lt;&lt; GC::PostCall;
1158}</pre></blockquote>
1159
1160<p>It can then use the following routines to access safe points.</p>
1161
Gordon Henriksenad93c4f2007-12-11 00:30:17 +00001162<blockquote><pre
1163>for (iterator I = begin(), E = end(); I != E; ++I) {
Gordon Henriksen01571ef2008-08-24 03:18:23 +00001164 GCFunctionInfo *MD = *I;
Gordon Henriksenad93c4f2007-12-11 00:30:17 +00001165 size_t PointCount = MD-&gt;size();
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001166
Gordon Henriksen01571ef2008-08-24 03:18:23 +00001167 for (GCFunctionInfo::iterator PI = MD-&gt;begin(),
1168 PE = MD-&gt;end(); PI != PE; ++PI) {
Gordon Henriksenad93c4f2007-12-11 00:30:17 +00001169 GC::PointKind PointKind = PI-&gt;Kind;
1170 unsigned PointNum = PI-&gt;Num;
1171 }
1172}
1173</pre></blockquote>
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001174
1175<p>Almost every collector requires <tt>PostCall</tt> safe points, since these
1176correspond to the moments when the function is suspended during a call to a
1177subroutine.</p>
1178
1179<p>Threaded programs generally require <tt>Loop</tt> safe points to guarantee
1180that the application will reach a safe point within a bounded amount of time,
1181even if it is executing a long-running loop which contains no function
1182calls.</p>
1183
1184<p>Threaded collectors may also require <tt>Return</tt> and <tt>PreCall</tt>
1185safe points to implement "stop the world" techniques using self-modifying code,
1186where it is important that the program not exit the function without reaching a
1187safe point (because only the topmost function has been patched).</p>
1188
1189</div>
1190
1191
1192<!-- ======================================================================= -->
1193<div class="doc_subsection">
Gordon Henriksen01571ef2008-08-24 03:18:23 +00001194 <a name="assembly">Emitting assembly code: <tt>GCMetadataPrinter</tt></a>
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001195</div>
1196
1197<div class="doc_text">
1198
Gordon Henriksen01571ef2008-08-24 03:18:23 +00001199<p>LLVM allows a GC to print arbitrary assembly code before and after the rest
1200of a module's assembly code. At the end of the module, the GC can print stack
1201maps built by the code generator. (At the beginning, this information is not
1202yet computed.)</p>
1203
1204<p>Since AsmWriter and CodeGen are separate components of LLVM, a separate
1205abstract base class and registry is provided for printing assembly code, the
1206<tt>GCMetadaPrinter</tt> and <tt>GCMetadaPrinterRegistry</tt>. The AsmWriter
1207will look for such a subclass if the <tt>GCStrategy</tt> sets
1208<tt>UsesMetadata</tt>:</p>
1209
1210<blockquote><pre
1211>MyGC::MyGC() {
1212 UsesMetadata = true;
1213}</pre></blockquote>
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001214
Gordon Henriksenad93c4f2007-12-11 00:30:17 +00001215<p>Note that LLVM does not currently have analogous APIs to support code
1216generation in the JIT, nor using the object writers.</p>
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001217
1218<blockquote><pre
Gordon Henriksen01571ef2008-08-24 03:18:23 +00001219>// lib/MyGC/MyGCPrinter.cpp - Example LLVM GC printer
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001220
Gordon Henriksen01571ef2008-08-24 03:18:23 +00001221#include "llvm/CodeGen/GCMetadataPrinter.h"
1222#include "llvm/Support/Compiler.h"
1223
1224using namespace llvm;
1225
1226namespace {
1227 class VISIBILITY_HIDDEN MyGCPrinter : public GCMetadataPrinter {
1228 public:
1229 virtual void beginAssembly(std::ostream &amp;OS, AsmPrinter &amp;AP,
1230 const TargetAsmInfo &amp;TAI);
1231
1232 virtual void finishAssembly(std::ostream &amp;OS, AsmPrinter &amp;AP,
1233 const TargetAsmInfo &amp;TAI);
1234 };
1235
1236 GCMetadataPrinterRegistry::Add&lt;MyGCPrinter&gt;
1237 X("mygc", "My bespoke garbage collector.");
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001238}</pre></blockquote>
1239
1240<p>The collector should use <tt>AsmPrinter</tt> and <tt>TargetAsmInfo</tt> to
Gordon Henriksenad93c4f2007-12-11 00:30:17 +00001241print portable assembly code to the <tt>std::ostream</tt>. The collector itself
1242contains the stack map for the entire module, and may access the
Gordon Henriksen01571ef2008-08-24 03:18:23 +00001243<tt>GCFunctionInfo</tt> using its own <tt>begin()</tt> and <tt>end()</tt>
Gordon Henriksenad93c4f2007-12-11 00:30:17 +00001244methods. Here's a realistic example:</p>
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001245
1246<blockquote><pre
1247>#include "llvm/CodeGen/AsmPrinter.h"
1248#include "llvm/Function.h"
Gordon Henriksenad93c4f2007-12-11 00:30:17 +00001249#include "llvm/Target/TargetMachine.h"
1250#include "llvm/Target/TargetData.h"
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001251#include "llvm/Target/TargetAsmInfo.h"
1252
Gordon Henriksen01571ef2008-08-24 03:18:23 +00001253void MyGCPrinter::beginAssembly(std::ostream &amp;OS, AsmPrinter &amp;AP,
Gordon Henriksenad93c4f2007-12-11 00:30:17 +00001254 const TargetAsmInfo &amp;TAI) {
1255 // Nothing to do.
1256}
1257
Gordon Henriksen01571ef2008-08-24 03:18:23 +00001258void MyGCPrinter::finishAssembly(std::ostream &amp;OS, AsmPrinter &amp;AP,
Gordon Henriksenad93c4f2007-12-11 00:30:17 +00001259 const TargetAsmInfo &amp;TAI) {
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001260 // Set up for emitting addresses.
1261 const char *AddressDirective;
1262 int AddressAlignLog;
Gordon Henriksenad93c4f2007-12-11 00:30:17 +00001263 if (AP.TM.getTargetData()->getPointerSize() == sizeof(int32_t)) {
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001264 AddressDirective = TAI.getData32bitsDirective();
1265 AddressAlignLog = 2;
1266 } else {
1267 AddressDirective = TAI.getData64bitsDirective();
1268 AddressAlignLog = 3;
1269 }
1270
1271 // Put this in the data section.
1272 AP.SwitchToDataSection(TAI.getDataSection());
1273
1274 // For each function...
Gordon Henriksenad93c4f2007-12-11 00:30:17 +00001275 for (iterator FI = begin(), FE = end(); FI != FE; ++FI) {
Gordon Henriksen01571ef2008-08-24 03:18:23 +00001276 GCFunctionInfo &amp;MD = **FI;
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001277
1278 // Emit this data structure:
1279 //
1280 // struct {
1281 // int32_t PointCount;
1282 // struct {
1283 // void *SafePointAddress;
1284 // int32_t LiveCount;
1285 // int32_t LiveOffsets[LiveCount];
1286 // } Points[PointCount];
1287 // } __gcmap_&lt;FUNCTIONNAME&gt;;
1288
1289 // Align to address width.
1290 AP.EmitAlignment(AddressAlignLog);
1291
1292 // Emit the symbol by which the stack map can be found.
1293 std::string Symbol;
1294 Symbol += TAI.getGlobalPrefix();
1295 Symbol += "__gcmap_";
1296 Symbol += MD.getFunction().getName();
1297 if (const char *GlobalDirective = TAI.getGlobalDirective())
1298 OS &lt;&lt; GlobalDirective &lt;&lt; Symbol &lt;&lt; "\n";
1299 OS &lt;&lt; TAI.getGlobalPrefix() &lt;&lt; Symbol &lt;&lt; ":\n";
1300
1301 // Emit PointCount.
1302 AP.EmitInt32(MD.size());
1303 AP.EOL("safe point count");
1304
1305 // And each safe point...
Gordon Henriksen01571ef2008-08-24 03:18:23 +00001306 for (GCFunctionInfo::iterator PI = MD.begin(),
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001307 PE = MD.end(); PI != PE; ++PI) {
1308 // Align to address width.
1309 AP.EmitAlignment(AddressAlignLog);
1310
1311 // Emit the address of the safe point.
1312 OS &lt;&lt; AddressDirective
1313 &lt;&lt; TAI.getPrivateGlobalPrefix() &lt;&lt; "label" &lt;&lt; PI-&gt;Num;
1314 AP.EOL("safe point address");
1315
1316 // Emit the stack frame size.
1317 AP.EmitInt32(MD.getFrameSize());
1318 AP.EOL("stack frame size");
1319
1320 // Emit the number of live roots in the function.
1321 AP.EmitInt32(MD.live_size(PI));
1322 AP.EOL("live root count");
1323
1324 // And for each live root...
Gordon Henriksen01571ef2008-08-24 03:18:23 +00001325 for (GCFunctionInfo::live_iterator LI = MD.live_begin(PI),
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001326 LE = MD.live_end(PI);
1327 LI != LE; ++LI) {
1328 // Print its offset within the stack frame.
1329 AP.EmitInt32(LI-&gt;StackOffset);
1330 AP.EOL("stack offset");
1331 }
1332 }
1333 }
1334}
1335</pre></blockquote>
1336
1337</div>
1338
1339
1340<!-- *********************************************************************** -->
1341<div class="doc_section">
1342 <a name="runtime-impl">Implementing a collector runtime</a>
1343</div>
1344<!-- *********************************************************************** -->
1345
1346<div class="doc_text">
1347
1348<p>Implementing a garbage collector for LLVM is fairly straightforward. The
1349LLVM garbage collectors are provided in a form that makes them easy to link into
1350the language-specific runtime that a language front-end would use. They require
1351functionality from the language-specific runtime to get information about <a
1352href="#gcdescriptors">where pointers are located in heap objects</a>.</p>
1353
1354<p>The implementation must include the
1355<a href="#allocate"><tt>llvm_gc_allocate</tt></a> and
1356<a href="#explicit"><tt>llvm_gc_collect</tt></a> functions. To do this, it will
1357probably have to <a href="#traceroots">trace through the roots
1358from the stack</a> and understand the <a href="#gcdescriptors">GC descriptors
Chris Lattner05d67092008-04-24 05:59:56 +00001359for heap objects</a>. Luckily, there are some <a href="#usage">example
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001360implementations</a> available.
1361</p>
1362</div>
1363
1364
1365<!-- ======================================================================= -->
1366<div class="doc_subsection">
Chris Lattner9b2a1842004-05-27 05:52:10 +00001367 <a name="gcdescriptors">Tracing GC pointers from heap objects</a>
1368</div>
1369
1370<div class="doc_text">
1371<p>
1372The three most common ways to keep track of where pointers live in heap objects
1373are (listed in order of space overhead required):</p>
1374
1375<ol>
1376<li>In languages with polymorphic objects, pointers from an object header are
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001377usually used to identify the GC pointers in the heap object. This is common for
Chris Lattner9b2a1842004-05-27 05:52:10 +00001378object-oriented languages like Self, Smalltalk, Java, or C#.</li>
1379
1380<li>If heap objects are not polymorphic, often the "shape" of the heap can be
1381determined from the roots of the heap or from some other meta-data [<a
1382href="#appel89">Appel89</a>, <a href="#goldberg91">Goldberg91</a>, <a
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001383href="#tolmach94">Tolmach94</a>]. In this case, the garbage collector can
1384propagate the information around from meta data stored with the roots. This
1385often eliminates the need to have a header on objects in the heap. This is
Chris Lattner9b2a1842004-05-27 05:52:10 +00001386common in the ML family.</li>
1387
1388<li>If all heap objects have pointers in the same locations, or pointers can be
1389distinguished just by looking at them (e.g., the low order bit is clear), no
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001390book-keeping is needed at all. This is common for Lisp-like languages.</li>
Chris Lattner9b2a1842004-05-27 05:52:10 +00001391</ol>
1392
1393<p>The LLVM garbage collectors are capable of supporting all of these styles of
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001394language, including ones that mix various implementations. To do this, it
Chris Lattner9b2a1842004-05-27 05:52:10 +00001395allows the source-language to associate meta-data with the <a
Chris Lattner05d67092008-04-24 05:59:56 +00001396href="#gcroot">stack roots</a>, and the heap tracing routines can propagate the
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001397information. In addition, LLVM allows the front-end to extract GC information
1398in any form from a specific object pointer (this supports situations #1 and #3).
Chris Lattner0d8c2db2004-05-23 21:02:20 +00001399</p>
1400
1401</div>
1402
Chris Lattner9b2a1842004-05-27 05:52:10 +00001403
1404<!-- *********************************************************************** -->
1405<div class="doc_section">
1406 <a name="references">References</a>
1407</div>
1408<!-- *********************************************************************** -->
1409
1410<div class="doc_text">
1411
1412<p><a name="appel89">[Appel89]</a> Runtime Tags Aren't Necessary. Andrew
1413W. Appel. Lisp and Symbolic Computation 19(7):703-705, July 1989.</p>
1414
1415<p><a name="goldberg91">[Goldberg91]</a> Tag-free garbage collection for
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001416strongly typed programming languages. Benjamin Goldberg. ACM SIGPLAN
Chris Lattner9b2a1842004-05-27 05:52:10 +00001417PLDI'91.</p>
1418
1419<p><a name="tolmach94">[Tolmach94]</a> Tag-free garbage collection using
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001420explicit type parameters. Andrew Tolmach. Proceedings of the 1994 ACM
Chris Lattner9b2a1842004-05-27 05:52:10 +00001421conference on LISP and functional programming.</p>
1422
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001423<p><a name="henderson02">[Henderson2002]</a> <a
1424href="http://citeseer.ist.psu.edu/henderson02accurate.html">
1425Accurate Garbage Collection in an Uncooperative Environment</a>.
1426Fergus Henderson. International Symposium on Memory Management 2002.</p>
1427
Chris Lattner9b2a1842004-05-27 05:52:10 +00001428</div>
Chris Lattner0d8c2db2004-05-23 21:02:20 +00001429
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001430
Chris Lattner0d8c2db2004-05-23 21:02:20 +00001431<!-- *********************************************************************** -->
1432
1433<hr>
1434<address>
1435 <a href="http://jigsaw.w3.org/css-validator/check/referer"><img
1436 src="http://jigsaw.w3.org/css-validator/images/vcss" alt="Valid CSS!"></a>
1437 <a href="http://validator.w3.org/check/referer"><img
1438 src="http://www.w3.org/Icons/valid-html401" alt="Valid HTML 4.01!"></a>
1439
1440 <a href="mailto:sabre@nondot.org">Chris Lattner</a><br>
Reid Spencer05fe4b02006-03-14 05:39:39 +00001441 <a href="http://llvm.org">LLVM Compiler Infrastructure</a><br>
Chris Lattner0d8c2db2004-05-23 21:02:20 +00001442 Last modified: $Date$
1443</address>
1444
1445</body>
1446</html>