blob: c2236d4d81a4e29910f3116946c71de438cac84d [file] [log] [blame]
Chris Lattner0d8c2db2004-05-23 21:02:20 +00001<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
2 "http://www.w3.org/TR/html4/strict.dtd">
3<html>
4<head>
Gordon Henriksen326e24f2007-09-27 19:31:36 +00005 <meta http-equiv="Content-Type" Content="text/html; charset=UTF-8" >
Chris Lattner0d8c2db2004-05-23 21:02:20 +00006 <title>Accurate Garbage Collection with LLVM</title>
7 <link rel="stylesheet" href="llvm.css" type="text/css">
Gordon Henriksen326e24f2007-09-27 19:31:36 +00008 <style type="text/css">
9 .rowhead { text-align: left; background: inherit; }
10 .indent { padding-left: 1em; }
11 .optl { color: #BFBFBF; }
12 </style>
Chris Lattner0d8c2db2004-05-23 21:02:20 +000013</head>
14<body>
15
16<div class="doc_title">
17 Accurate Garbage Collection with LLVM
18</div>
19
20<ol>
21 <li><a href="#introduction">Introduction</a>
22 <ul>
Gordon Henriksenc9c0b592009-03-02 03:47:20 +000023 <li><a href="#feature">Goals and non-goals</a></li>
Chris Lattner0d8c2db2004-05-23 21:02:20 +000024 </ul>
25 </li>
26
Gordon Henriksenc9c0b592009-03-02 03:47:20 +000027 <li><a href="#quickstart">Getting started</a>
Chris Lattner0d8c2db2004-05-23 21:02:20 +000028 <ul>
Gordon Henriksenc9c0b592009-03-02 03:47:20 +000029 <li><a href="quickstart-compiler">In your compiler</a></li>
30 <li><a href="quickstart-runtime">In your runtime library</a></li>
31 <li><a href="shadow-stack">About the shadow stack</a></li>
Gordon Henriksen326e24f2007-09-27 19:31:36 +000032 </ul>
33 </li>
34
Gordon Henriksenad93c4f2007-12-11 00:30:17 +000035 <li><a href="#core">Core support</a>
Gordon Henriksen326e24f2007-09-27 19:31:36 +000036 <ul>
Gordon Henriksenad93c4f2007-12-11 00:30:17 +000037 <li><a href="#gcattr">Specifying GC code generation:
38 <tt>gc "..."</tt></a></li>
Gordon Henriksen326e24f2007-09-27 19:31:36 +000039 <li><a href="#gcroot">Identifying GC roots on the stack:
40 <tt>llvm.gcroot</tt></a></li>
41 <li><a href="#barriers">Reading and writing references in the heap</a>
42 <ul>
43 <li><a href="#gcwrite">Write barrier: <tt>llvm.gcwrite</tt></a></li>
44 <li><a href="#gcread">Read barrier: <tt>llvm.gcread</tt></a></li>
45 </ul>
46 </li>
47 </ul>
48 </li>
49
Gordon Henriksenc9c0b592009-03-02 03:47:20 +000050 <li><a href="#plugin">Compiler plugin interface</a>
Chris Lattner0d8c2db2004-05-23 21:02:20 +000051 <ul>
Gordon Henriksen326e24f2007-09-27 19:31:36 +000052 <li><a href="#collector-algos">Overview of available features</a></li>
53 <li><a href="#stack-map">Computing stack maps</a></li>
54 <li><a href="#init-roots">Initializing roots to null:
55 <tt>InitRoots</tt></a></li>
56 <li><a href="#custom">Custom lowering of intrinsics: <tt>CustomRoots</tt>,
57 <tt>CustomReadBarriers</tt>, and <tt>CustomWriteBarriers</tt></a></li>
58 <li><a href="#safe-points">Generating safe points:
59 <tt>NeededSafePoints</tt></a></li>
60 <li><a href="#assembly">Emitting assembly code:
Gordon Henriksen01571ef2008-08-24 03:18:23 +000061 <tt>GCMetadataPrinter</tt></a></li>
Chris Lattner0b02dbc2004-07-09 05:03:54 +000062 </ul>
Chris Lattner9b2a1842004-05-27 05:52:10 +000063 </li>
Chris Lattner0d8c2db2004-05-23 21:02:20 +000064
Gordon Henriksen326e24f2007-09-27 19:31:36 +000065 <li><a href="#runtime-impl">Implementing a collector runtime</a>
66 <ul>
67 <li><a href="#gcdescriptors">Tracing GC pointers from heap
68 objects</a></li>
69 </ul>
70 </li>
71
72 <li><a href="#references">References</a></li>
73
Chris Lattner0d8c2db2004-05-23 21:02:20 +000074</ol>
75
76<div class="doc_author">
Gordon Henriksen326e24f2007-09-27 19:31:36 +000077 <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a> and
78 Gordon Henriksen</p>
Chris Lattner0d8c2db2004-05-23 21:02:20 +000079</div>
80
81<!-- *********************************************************************** -->
82<div class="doc_section">
83 <a name="introduction">Introduction</a>
84</div>
85<!-- *********************************************************************** -->
86
87<div class="doc_text">
88
89<p>Garbage collection is a widely used technique that frees the programmer from
Gordon Henriksen326e24f2007-09-27 19:31:36 +000090having to know the lifetimes of heap objects, making software easier to produce
91and maintain. Many programming languages rely on garbage collection for
92automatic memory management. There are two primary forms of garbage collection:
Chris Lattner0d8c2db2004-05-23 21:02:20 +000093conservative and accurate.</p>
94
95<p>Conservative garbage collection often does not require any special support
96from either the language or the compiler: it can handle non-type-safe
97programming languages (such as C/C++) and does not require any special
Gordon Henriksen326e24f2007-09-27 19:31:36 +000098information from the compiler. The
Jeff Cohen65fc36b2007-04-18 17:26:14 +000099<a href="http://www.hpl.hp.com/personal/Hans_Boehm/gc/">Boehm collector</a> is
100an example of a state-of-the-art conservative collector.</p>
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000101
102<p>Accurate garbage collection requires the ability to identify all pointers in
103the program at run-time (which requires that the source-language be type-safe in
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000104most cases). Identifying pointers at run-time requires compiler support to
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000105locate all places that hold live pointer variables at run-time, including the
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000106<a href="#gcroot">processor stack and registers</a>.</p>
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000107
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000108<p>Conservative garbage collection is attractive because it does not require any
109special compiler support, but it does have problems. In particular, because the
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000110conservative garbage collector cannot <i>know</i> that a particular word in the
111machine is a pointer, it cannot move live objects in the heap (preventing the
112use of compacting and generational GC algorithms) and it can occasionally suffer
113from memory leaks due to integer values that happen to point to objects in the
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000114program. In addition, some aggressive compiler transformations can break
115conservative garbage collectors (though these seem rare in practice).</p>
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000116
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000117<p>Accurate garbage collectors do not suffer from any of these problems, but
118they can suffer from degraded scalar optimization of the program. In particular,
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000119because the runtime must be able to identify and update all pointers active in
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000120the program, some optimizations are less effective. In practice, however, the
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000121locality and performance benefits of using aggressive garbage allocation
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000122techniques dominates any low-level losses.</p>
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000123
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000124<p>This document describes the mechanisms and interfaces provided by LLVM to
125support accurate garbage collection.</p>
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000126
127</div>
128
129<!-- ======================================================================= -->
130<div class="doc_subsection">
Gordon Henriksenc9c0b592009-03-02 03:47:20 +0000131 <a name="feature">Goals and non-goals</a>
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000132</div>
133
134<div class="doc_text">
135
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000136<p>LLVM's intermediate representation provides <a href="#intrinsics">garbage
Chris Lattner05d67092008-04-24 05:59:56 +0000137collection intrinsics</a> that offer support for a broad class of
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000138collector models. For instance, the intrinsics permit:</p>
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000139
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000140<ul>
141 <li>semi-space collectors</li>
142 <li>mark-sweep collectors</li>
143 <li>generational collectors</li>
144 <li>reference counting</li>
145 <li>incremental collectors</li>
146 <li>concurrent collectors</li>
147 <li>cooperative collectors</li>
148</ul>
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000149
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000150<p>We hope that the primitive support built into the LLVM IR is sufficient to
151support a broad class of garbage collected languages including Scheme, ML, Java,
152C#, Perl, Python, Lua, Ruby, other scripting languages, and more.</p>
153
Gordon Henriksenc9c0b592009-03-02 03:47:20 +0000154<p>However, LLVM does not itself provide a garbage collector&#151;this should
155be part of your language's runtime library. LLVM provides a framework for
156compile time <a href="#plugin">code generation plugins</a>. The role of these
157plugins is to generate code and data structures which conforms to the <em>binary
158interface</em> specified by the <em>runtime library</em>. This is similar to the
159relationship between LLVM and DWARF debugging info, for example. The
160difference primarily lies in the lack of an established standard in the domain
161of garbage collection&#151;thus the plugins.</p>
162
163<p>The aspects of the binary interface with which LLVM's GC support is
164concerned are:</p>
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000165
166<ul>
Gordon Henriksenc9c0b592009-03-02 03:47:20 +0000167 <li>Creation of GC-safe points within code where collection is allowed to
168 execute safely.</li>
169 <li>Definition of a stack frame descriptor. For each safe point in the code,
170 a frame descriptor maps where object references are located within the
171 frame so that the GC may traverse and perhaps update them.</li>
172 <li>Write barriers when storing object references within the heap. These
173 are commonly used to optimize incremental scans.</li>
174 <li>Emission of read barriers when loading object references. These are
175 useful for interoperating with concurrent collectors.</li>
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000176</ul>
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000177
Gordon Henriksenc9c0b592009-03-02 03:47:20 +0000178<p>There are additional areas that LLVM does not directly address:</p>
179
180<ul>
181 <li>Registration of global roots.</li>
182 <li>Discovery or registration of stack frame descriptors.</li>
183 <li>The functions used by the program to allocate memory, trigger a
184 collection, etc.</li>
185</ul>
186
187<p>In general, LLVM's support for GC does not include features which can be
188adequately addressed with other features of the IR and does not specify a
189particular binary interface. On the plus side, this means that you should be
190able to integrate LLVM with an existing runtime. On the other hand, it leaves
191a lot of work for the developer of a novel language. However, it's easy to get
192started quickly and scale up to a more sophisticated implementation as your
193compiler matures.</p>
194
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000195</div>
196
197<!-- *********************************************************************** -->
198<div class="doc_section">
Gordon Henriksenc9c0b592009-03-02 03:47:20 +0000199 <a name="quickstart">Getting started</a>
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000200</div>
201<!-- *********************************************************************** -->
202
203<div class="doc_text">
204
Gordon Henriksenc9c0b592009-03-02 03:47:20 +0000205<p>Using a GC with LLVM implies many things, for example:</p>
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000206
207<ul>
Gordon Henriksenc9c0b592009-03-02 03:47:20 +0000208 <li>Write a runtime library or find an existing one which implements a GC
209 heap.<ol>
210 <li>Implement a memory allocator.</li>
211 <li>Design a binary interface for frame descriptors, used to identify
212 references within a stack frame.*</li>
213 <li>Implement a stack crawler to discover functions on the call stack.*</li>
214 <li>Implement a registry for global roots.</li>
215 <li>Design a binary interface for type descriptors, used to map references
216 within heap objects.</li>
217 <li>Implement a collection routine bringing together all of the above.</li>
218 </ol></li>
219 <li>Emit compatible code from your compiler.<ul>
220 <li>Initialization in the main function.</li>
221 <li>Use the <tt>gc "..."</tt> attribute to enable GC code generation
222 (or <tt>F.setGC("...")</tt>).</li>
223 <li>Use <tt>@llvm.gcroot</tt> to mark stack roots.</li>
224 <li>Use <tt>@llvm.gcread</tt> and/or <tt>@llvm.gcwrite</tt> to
225 manipulate GC references, if necessary.</li>
226 <li>Allocate memory using the GC allocation routine provided by the
227 runtime library.</li>
228 <li>Generate type descriptors according to your runtime's binary interface.</li>
229 </ul></li>
230 <li>Write a compiler plugin to interface LLVM with the runtime library.*<ul>
231 <li>Lower <tt>@llvm.gcread</tt> and <tt>@llvm.gcwrite</tt> to appropriate
232 code sequences.*</li>
233 <li>Generate stack maps according to the runtime's binary interface.*</li>
234 </ul></li>
235 <li>Load the plugin into the compiler. Use <tt>llc -load</tt> or link the
236 plugin statically with your language's compiler.*</li>
237 <li>Link program executables with the runtime.</li>
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000238</ul>
239
Gordon Henriksenc9c0b592009-03-02 03:47:20 +0000240<p>To help with several of these tasks (those indicated with a *), LLVM
241includes a highly portable, built-in ShadowStack code generator. It is compiled
242into <tt>llc</tt> and works even with the interpreter and C backends.</p>
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000243
244</div>
245
246<!-- ======================================================================= -->
247<div class="doc_subsection">
Gordon Henriksenc9c0b592009-03-02 03:47:20 +0000248 <a name="quickstart-compiler">In your compiler</a>
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000249</div>
250
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000251<div class="doc_text">
252
Gordon Henriksenc9c0b592009-03-02 03:47:20 +0000253<p>To turn the shadow stack on for your functions, first call:</p>
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000254
Gordon Henriksenc9c0b592009-03-02 03:47:20 +0000255<div class="doc_code"><pre
256>F.setGC("shadow-stack");</pre></div>
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000257
Gordon Henriksenc9c0b592009-03-02 03:47:20 +0000258<p>for each function your compiler emits. Since the shadow stack is built into
259LLVM, you do not need to load a plugin.</p>
260
261<p>Your compiler must also use <tt>@llvm.gcroot</tt> as documented.
262Don't forget to create a root for each intermediate value that is generated
263when evaluating an expression. In <tt>h(f(), g())</tt>, the result of
264<tt>f()</tt> could easily be collected if evaluating <tt>g()</tt> triggers a
265collection.</p>
266
267<p>There's no need to use <tt>@llvm.gcread</tt> and <tt>@llvm.gcwrite</tt> over
268plain <tt>load</tt> and <tt>store</tt> for now. You will need them when
269switching to a more advanced GC.</p>
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000270
271</div>
272
273<!-- ======================================================================= -->
274<div class="doc_subsection">
Gordon Henriksenc9c0b592009-03-02 03:47:20 +0000275 <a name="quickstart-runtime">In your runtime</a>
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000276</div>
277
278<div class="doc_text">
279
Gordon Henriksenc9c0b592009-03-02 03:47:20 +0000280<p>The shadow stack doesn't imply a memory allocation algorithm. A semispace
281collector or building atop <tt>malloc</tt> are great places to start, and can
282be implemented with very little code.</p>
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000283
Gordon Henriksenc9c0b592009-03-02 03:47:20 +0000284<p>When it comes time to collect, however, your runtime needs to traverse the
285stack roots, and for this it needs to integrate with the shadow stack. Luckily,
286doing so is very simple. (This code is heavily commented to help you
287understand the data structure, but there are only 20 lines of meaningful
288code.)</p>
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000289
290</div>
291
Gordon Henriksenc9c0b592009-03-02 03:47:20 +0000292<div class="doc_code"><pre
293>/// @brief A constant shadow stack frame descriptor. The compiler emits one of
294/// these for each function.
295///
296/// Storage of metadata values is elided if the %meta parameter to @llvm.gcroot
297/// is null.
298struct FrameMap {
299 int32_t NumRoots; //&lt; Number of roots in stack frame.
300 int32_t NumMeta; //&lt; Number of metadata descriptors. May be &lt; NumRoots.
301 const void *Meta[0]; //&lt; Metadata for each root.
302};
303
304/// @brief A link in the dynamic shadow stack. One of these is embedded in the
305/// stack frame of each function on the call stack.
306struct StackEntry {
307 StackEntry *Next; //&lt; Link to next stack entry (the caller's).
308 const FrameMap *Map; //&lt; Pointer to constant FrameMap.
309 void *Roots[0]; //&lt; Stack roots (in-place array).
310};
311
312/// @brief The head of the singly-linked list of StackEntries. Functions push
313/// and pop onto this in their prologue and epilogue.
314///
315/// Since there is only a global list, this technique is not threadsafe.
316StackEntry *llvm_gc_root_chain;
317
318/// @brief Calls Visitor(root, meta) for each GC root on the stack.
319/// root and meta are exactly the values passed to
320/// <tt>@llvm.gcroot</tt>.
321///
322/// Visitor could be a function to recursively mark live objects. Or it
323/// might copy them to another heap or generation.
324///
325/// @param Visitor A function to invoke for every GC root on the stack.
326void visitGCRoots(void (*Visitor)(void **Root, const void *Meta)) {
327 for (StackEntry *R = llvm_gc_root_chain; R; R = R->Next) {
328 unsigned i = 0;
329
330 // For roots [0, NumMeta), the metadata pointer is in the FrameMap.
331 for (unsigned e = R->Map->NumMeta; i != e; ++i)
332 Visitor(&R->Roots[i], R->Map->Meta[i]);
333
334 // For roots [NumMeta, NumRoots), the metadata pointer is null.
335 for (unsigned e = R->Map->NumRoots; i != e; ++i)
336 Visitor(&R->Roots[i], NULL);
337 }
338}</pre></div>
339
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000340<!-- ======================================================================= -->
341<div class="doc_subsection">
Gordon Henriksenc9c0b592009-03-02 03:47:20 +0000342 <a name="shadow-stack">About the shadow stack</a>
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000343</div>
344
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000345<div class="doc_text">
346
Gordon Henriksenc9c0b592009-03-02 03:47:20 +0000347<p>Unlike many GC algorithms which rely on a cooperative code generator to
348generate stack maps, this algorithm carefully maintains a linked list of stack
349root descriptors [<a href="#henderson02">Henderson2002</a>]. This so-called
350"shadow stack" mirrors the machine stack. Maintaining this data structure is
351slower than using stack maps, but has a significant portability advantage
352because it requires no special support from the target code generator.</p>
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000353
Gordon Henriksenc9c0b592009-03-02 03:47:20 +0000354<p>The tradeoff for this simplicity and portability is:</p>
355
356<ul>
357 <li>High overhead per function call.</li>
358 <li>Not thread-safe.</li>
359</ul>
360
361<p>Still, it's an easy way to get started.</p>
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000362
363</div>
364
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000365<!-- *********************************************************************** -->
366<div class="doc_section">
Gordon Henriksenc9c0b592009-03-02 03:47:20 +0000367 <a name="core">IR features</a><a name="intrinsics"></a>
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000368</div>
369<!-- *********************************************************************** -->
370
371<div class="doc_text">
372
373<p>This section describes the garbage collection facilities provided by the
Gordon Henriksenc9c0b592009-03-02 03:47:20 +0000374<a href="LangRef.html">LLVM intermediate representation</a>. The exact behavior
375of these IR features is specified by the binary interface implemented by a
376<a href="#plugin">code generation plugin</a>, not by this document.</p>
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000377
Gordon Henriksenc9c0b592009-03-02 03:47:20 +0000378<p>These facilities are limited to those strictly necessary; they are not
379intended to be a complete interface to any garbage collector. A program will
380need to interface with the GC library using the facilities provided by that
381program.</p>
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000382
383</div>
384
385<!-- ======================================================================= -->
386<div class="doc_subsection">
Gordon Henriksenad93c4f2007-12-11 00:30:17 +0000387 <a name="gcattr">Specifying GC code generation: <tt>gc "..."</tt></a>
388</div>
389
390<div class="doc_code"><tt>
Gordon Henriksenc9c0b592009-03-02 03:47:20 +0000391 define <i>ty</i> @<i>name</i>(...) <u>gc "<i>name</i>"</u> { ...
Gordon Henriksenad93c4f2007-12-11 00:30:17 +0000392</tt></div>
393
394<div class="doc_text">
395
Gordon Henriksenc9c0b592009-03-02 03:47:20 +0000396<p>The <tt>gc</tt> function attribute is used to specify the desired GC style
397to the compiler. Its programmatic equivalent is the <tt>setGC</tt> method of
398<tt>Function</tt>.</p>
Gordon Henriksenad93c4f2007-12-11 00:30:17 +0000399
Gordon Henriksenc9c0b592009-03-02 03:47:20 +0000400<p>Setting <tt>gc "<i>name</i>"</tt> on a function triggers a search for a
401matching code generation plugin "<i>name</i>"; it is that plugin which defines
402the exact nature of the code generated to support GC. If none is found, the
403compiler will raise an error.</p>
404
405<p>Specifying the GC style on a per-function basis allows LLVM to link together
406programs that use different garbage collection algorithms (or none at all).</p>
Gordon Henriksenad93c4f2007-12-11 00:30:17 +0000407
408</div>
409
410<!-- ======================================================================= -->
411<div class="doc_subsection">
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000412 <a name="gcroot">Identifying GC roots on the stack: <tt>llvm.gcroot</tt></a>
413</div>
414
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000415<div class="doc_code"><tt>
Chris Lattner17340552008-04-24 06:00:30 +0000416 void @llvm.gcroot(i8** %ptrloc, i8* %metadata)
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000417</tt></div>
418
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000419<div class="doc_text">
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000420
Gordon Henriksenc9c0b592009-03-02 03:47:20 +0000421<p>The <tt>llvm.gcroot</tt> intrinsic is used to inform LLVM that a stack
422variable references an object on the heap and is to be tracked for garbage
423collection. The exact impact on generated code is specified by a <a
424href="#plugin">compiler plugin</a>.</p>
425
426<p>A compiler which uses mem2reg to raise imperative code using <tt>alloca</tt>
427into SSA form need only add a call to <tt>@llvm.gcroot</tt> for those variables
428which a pointers into the GC heap.</p>
429
430<p>It is also important to mark intermediate values with <tt>llvm.gcroot</tt>.
431For example, consider <tt>h(f(), g())</tt>. Beware leaking the result of
432<tt>f()</tt> in the case that <tt>g()</tt> triggers a collection.</p>
433
434<p>The first argument <b>must</b> be a value referring to an alloca instruction
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000435or a bitcast of an alloca. The second contains a pointer to metadata that
436should be associated with the pointer, and <b>must</b> be a constant or global
437value address. If your target collector uses tags, use a null pointer for
438metadata.</p>
439
Gordon Henriksenc9c0b592009-03-02 03:47:20 +0000440<p>The <tt>%metadata</tt> argument can be used to avoid requiring heap objects
441to have 'isa' pointers or tag bits. [<a href="#appel89">Appel89</a>, <a
442href="#goldberg91">Goldberg91</a>, <a href="#tolmach94">Tolmach94</a>] If
443specified, its value will be tracked along with the location of the pointer in
444the stack frame.</p>
445
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000446<p>Consider the following fragment of Java code:</p>
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000447
448<pre>
449 {
450 Object X; // A null-initialized reference to an object
451 ...
452 }
453</pre>
454
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000455<p>This block (which may be located in the middle of a function or in a loop
456nest), could be compiled to this LLVM code:</p>
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000457
458<pre>
459Entry:
460 ;; In the entry block for the function, allocate the
461 ;; stack space for X, which is an LLVM pointer.
462 %X = alloca %Object*
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000463
464 ;; Tell LLVM that the stack space is a stack root.
465 ;; Java has type-tags on objects, so we pass null as metadata.
466 %tmp = bitcast %Object** %X to i8**
Chris Lattner17340552008-04-24 06:00:30 +0000467 call void @llvm.gcroot(i8** %X, i8* null)
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000468 ...
469
470 ;; "CodeBlock" is the block corresponding to the start
Reid Spencer03d186a2004-05-25 08:45:31 +0000471 ;; of the scope above.
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000472CodeBlock:
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000473 ;; Java null-initializes pointers.
474 store %Object* null, %Object** %X
475
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000476 ...
477
478 ;; As the pointer goes out of scope, store a null value into
479 ;; it, to indicate that the value is no longer live.
480 store %Object* null, %Object** %X
481 ...
482</pre>
483
484</div>
485
486<!-- ======================================================================= -->
487<div class="doc_subsection">
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000488 <a name="barriers">Reading and writing references in the heap</a>
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000489</div>
490
491<div class="doc_text">
492
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000493<p>Some collectors need to be informed when the mutator (the program that needs
494garbage collection) either reads a pointer from or writes a pointer to a field
495of a heap object. The code fragments inserted at these points are called
496<em>read barriers</em> and <em>write barriers</em>, respectively. The amount of
497code that needs to be executed is usually quite small and not on the critical
498path of any computation, so the overall performance impact of the barrier is
499tolerable.</p>
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000500
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000501<p>Barriers often require access to the <em>object pointer</em> rather than the
502<em>derived pointer</em> (which is a pointer to the field within the
503object). Accordingly, these intrinsics take both pointers as separate arguments
504for completeness. In this snippet, <tt>%object</tt> is the object pointer, and
505<tt>%derived</tt> is the derived pointer:</p>
506
Chris Lattner05d67092008-04-24 05:59:56 +0000507<blockquote><pre>
508 ;; An array type.
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000509 %class.Array = type { %class.Object, i32, [0 x %class.Object*] }
Chris Lattner05d67092008-04-24 05:59:56 +0000510 ...
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000511
512 ;; Load the object pointer from a gcroot.
513 %object = load %class.Array** %object_addr
514
515 ;; Compute the derived pointer.
Chris Lattner05d67092008-04-24 05:59:56 +0000516 %derived = getelementptr %object, i32 0, i32 2, i32 %n</pre></blockquote>
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000517
Gordon Henriksenc9c0b592009-03-02 03:47:20 +0000518<p>The use of these intrinsics is naturally optional if the target GC does
519require the corresponding barrier. If so, the GC plugin will replace the
520intrinsic calls with the corresponding <tt>load</tt> or <tt>store</tt>
521instruction if they are used.</p>
522
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000523</div>
524
525<!-- ======================================================================= -->
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000526<div class="doc_subsubsection">
527 <a name="gcwrite">Write barrier: <tt>llvm.gcwrite</tt></a>
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000528</div>
529
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000530<div class="doc_code"><tt>
531void @llvm.gcwrite(i8* %value, i8* %object, i8** %derived)
532</tt></div>
533
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000534<div class="doc_text">
535
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000536<p>For write barriers, LLVM provides the <tt>llvm.gcwrite</tt> intrinsic
537function. It has exactly the same semantics as a non-volatile <tt>store</tt> to
Gordon Henriksenc9c0b592009-03-02 03:47:20 +0000538the derived pointer (the third argument). The exact code generated is specified
539by a <a href="#plugin">compiler plugin</a>.</p>
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000540
541<p>Many important algorithms require write barriers, including generational
542and concurrent collectors. Additionally, write barriers could be used to
543implement reference counting.</p>
544
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000545</div>
546
547<!-- ======================================================================= -->
548<div class="doc_subsubsection">
549 <a name="gcread">Read barrier: <tt>llvm.gcread</tt></a>
550</div>
551
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000552<div class="doc_code"><tt>
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000553i8* @llvm.gcread(i8* %object, i8** %derived)<br>
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000554</tt></div>
555
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000556<div class="doc_text">
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000557
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000558<p>For read barriers, LLVM provides the <tt>llvm.gcread</tt> intrinsic function.
559It has exactly the same semantics as a non-volatile <tt>load</tt> from the
Gordon Henriksenc9c0b592009-03-02 03:47:20 +0000560derived pointer (the second argument). The exact code generated is specified by
561a <a href="#plugin">compiler plugin</a>.</p>
Chris Lattner0d8c2db2004-05-23 21:02:20 +0000562
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000563<p>Read barriers are needed by fewer algorithms than write barriers, and may
564have a greater performance impact since pointer reads are more frequent than
565writes.</p>
566
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000567</div>
568
569<!-- *********************************************************************** -->
570<div class="doc_section">
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000571 <a name="plugin">Implementing a collector plugin</a>
572</div>
573<!-- *********************************************************************** -->
574
575<div class="doc_text">
576
Gordon Henriksen01571ef2008-08-24 03:18:23 +0000577<p>User code specifies which GC code generation to use with the <tt>gc</tt>
578function attribute or, equivalently, with the <tt>setGC</tt> method of
Gordon Henriksenad93c4f2007-12-11 00:30:17 +0000579<tt>Function</tt>.</p>
580
Gordon Henriksen01571ef2008-08-24 03:18:23 +0000581<p>To implement a GC plugin, it is necessary to subclass
582<tt>llvm::GCStrategy</tt>, which can be accomplished in a few lines of
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000583boilerplate code. LLVM's infrastructure provides access to several important
584algorithms. For an uncontroversial collector, all that remains may be to emit
585the assembly code for the collector's unique stack map data structure, which
586might be accomplished in as few as 100 LOC.</p>
587
Gordon Henriksen01571ef2008-08-24 03:18:23 +0000588<p>This is not the appropriate place to implement a garbage collected heap or a
589garbage collector itself. That code should exist in the language's runtime
Gordon Henriksenc9c0b592009-03-02 03:47:20 +0000590library. The compiler plugin is responsible for generating code which
591conforms to the binary interface defined by library, most essentially the
592<a href="stack-map">stack map</a>.</p>
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000593
Gordon Henriksen01571ef2008-08-24 03:18:23 +0000594<p>To subclass <tt>llvm::GCStrategy</tt> and register it with the compiler:</p>
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000595
Gordon Henriksen01571ef2008-08-24 03:18:23 +0000596<blockquote><pre>// lib/MyGC/MyGC.cpp - Example LLVM GC plugin
597
598#include "llvm/CodeGen/GCStrategy.h"
599#include "llvm/CodeGen/GCMetadata.h"
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000600#include "llvm/Support/Compiler.h"
601
602using namespace llvm;
603
604namespace {
Gordon Henriksen01571ef2008-08-24 03:18:23 +0000605 class VISIBILITY_HIDDEN MyGC : public GCStrategy {
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000606 public:
Gordon Henriksen01571ef2008-08-24 03:18:23 +0000607 MyGC() {}
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000608 };
609
Gordon Henriksen01571ef2008-08-24 03:18:23 +0000610 GCRegistry::Add&lt;MyGC&gt;
Gordon Henriksenad93c4f2007-12-11 00:30:17 +0000611 X("mygc", "My bespoke garbage collector.");
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000612}</pre></blockquote>
613
614<p>Using the LLVM makefiles (like the <a
615href="http://llvm.org/viewvc/llvm-project/llvm/trunk/projects/sample/">sample
616project</a>), this can be built into a plugin using a simple makefile:</p>
617
618<blockquote><pre
619># lib/MyGC/Makefile
620
621LEVEL := ../..
622LIBRARYNAME = <var>MyGC</var>
623LOADABLE_MODULE = 1
624
625include $(LEVEL)/Makefile.common</pre></blockquote>
626
Gordon Henriksenad93c4f2007-12-11 00:30:17 +0000627<p>Once the plugin is compiled, code using it may be compiled using <tt>llc
628-load=<var>MyGC.so</var></tt> (though <var>MyGC.so</var> may have some other
629platform-specific extension):</p>
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000630
631<blockquote><pre
Gordon Henriksenad93c4f2007-12-11 00:30:17 +0000632>$ cat sample.ll
633define void @f() gc "mygc" {
634entry:
635 ret void
636}
637$ llvm-as &lt; sample.ll | llc -load=MyGC.so</pre></blockquote>
638
639<p>It is also possible to statically link the collector plugin into tools, such
640as a language-specific compiler front-end.</p>
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000641
642</div>
643
644<!-- ======================================================================= -->
645<div class="doc_subsection">
646 <a name="collector-algos">Overview of available features</a>
647</div>
648
649<div class="doc_text">
650
651<p>The boilerplate collector above does nothing. More specifically:</p>
652
653<ul>
654 <li><tt>llvm.gcread</tt> calls are replaced with the corresponding
655 <tt>load</tt> instruction.</li>
656 <li><tt>llvm.gcwrite</tt> calls are replaced with the corresponding
657 <tt>store</tt> instruction.</li>
658 <li>No stack map is emitted, and no safe points are added.</li>
659</ul>
660
661<p><tt>Collector</tt> provides a range of features through which a plugin
662collector may do useful work. This matrix summarizes the supported (and planned)
663features and correlates them with the collection techniques which typically
664require them.</p>
665
666<table>
667 <tr>
668 <th>Algorithm</th>
669 <th>Done</th>
670 <th>shadow stack</th>
671 <th>refcount</th>
672 <th>mark-sweep</th>
673 <th>copying</th>
674 <th>incremental</th>
675 <th>threaded</th>
676 <th>concurrent</th>
677 </tr>
678 <tr>
679 <th class="rowhead"><a href="#stack-map">stack map</a></th>
680 <td>&#10004;</td>
681 <td></td>
682 <td></td>
683 <td>&#10008;</td>
684 <td>&#10008;</td>
685 <td>&#10008;</td>
686 <td>&#10008;</td>
687 <td>&#10008;</td>
688 </tr>
689 <tr>
690 <th class="rowhead"><a href="#init-roots">initialize roots</a></th>
691 <td>&#10004;</td>
692 <td>&#10008;</td>
693 <td>&#10008;</td>
694 <td>&#10008;</td>
695 <td>&#10008;</td>
696 <td>&#10008;</td>
697 <td>&#10008;</td>
698 <td>&#10008;</td>
699 </tr>
700 <tr class="doc_warning">
701 <th class="rowhead">derived pointers</th>
702 <td>NO</td>
703 <td></td>
704 <td></td>
705 <td></td>
706 <td></td>
707 <td></td>
708 <td>&#10008;*</td>
709 <td>&#10008;*</td>
710 </tr>
711 <tr>
712 <th class="rowhead"><em><a href="#custom">custom lowering</a></em></th>
713 <td>&#10004;</td>
714 <th></th>
715 <th></th>
716 <th></th>
717 <th></th>
718 <th></th>
719 <th></th>
720 <th></th>
721 </tr>
722 <tr>
723 <th class="rowhead indent">gcroot</th>
724 <td>&#10004;</td>
725 <td>&#10008;</td>
726 <td>&#10008;</td>
727 <td></td>
728 <td></td>
729 <td></td>
730 <td></td>
731 <td></td>
732 </tr>
733 <tr>
734 <th class="rowhead indent">gcwrite</th>
735 <td>&#10004;</td>
736 <td></td>
737 <td>&#10008;</td>
738 <td></td>
739 <td></td>
740 <td>&#10008;</td>
741 <td></td>
742 <td>&#10008;</td>
743 </tr>
744 <tr>
745 <th class="rowhead indent">gcread</th>
746 <td>&#10004;</td>
747 <td></td>
748 <td></td>
749 <td></td>
750 <td></td>
751 <td></td>
752 <td></td>
753 <td>&#10008;</td>
754 </tr>
755 <tr>
756 <th class="rowhead"><em><a href="#safe-points">safe points</a></em></th>
757 <td></td>
758 <th></th>
759 <th></th>
760 <th></th>
761 <th></th>
762 <th></th>
763 <th></th>
764 <th></th>
765 </tr>
766 <tr>
767 <th class="rowhead indent">in calls</th>
768 <td>&#10004;</td>
769 <td></td>
770 <td></td>
771 <td>&#10008;</td>
772 <td>&#10008;</td>
773 <td>&#10008;</td>
774 <td>&#10008;</td>
775 <td>&#10008;</td>
776 </tr>
777 <tr>
778 <th class="rowhead indent">before calls</th>
779 <td>&#10004;</td>
780 <td></td>
781 <td></td>
782 <td></td>
783 <td></td>
784 <td></td>
785 <td>&#10008;</td>
786 <td>&#10008;</td>
787 </tr>
788 <tr class="doc_warning">
789 <th class="rowhead indent">for loops</th>
790 <td>NO</td>
791 <td></td>
792 <td></td>
793 <td></td>
794 <td></td>
795 <td></td>
796 <td>&#10008;</td>
797 <td>&#10008;</td>
798 </tr>
799 <tr>
800 <th class="rowhead indent">before escape</th>
801 <td>&#10004;</td>
802 <td></td>
803 <td></td>
804 <td></td>
805 <td></td>
806 <td></td>
807 <td>&#10008;</td>
808 <td>&#10008;</td>
809 </tr>
810 <tr class="doc_warning">
811 <th class="rowhead">emit code at safe points</th>
812 <td>NO</td>
813 <td></td>
814 <td></td>
815 <td></td>
816 <td></td>
817 <td></td>
818 <td>&#10008;</td>
819 <td>&#10008;</td>
820 </tr>
821 <tr>
822 <th class="rowhead"><em>output</em></th>
823 <td></td>
824 <th></th>
825 <th></th>
826 <th></th>
827 <th></th>
828 <th></th>
829 <th></th>
830 <th></th>
831 </tr>
832 <tr>
833 <th class="rowhead indent"><a href="#assembly">assembly</a></th>
834 <td>&#10004;</td>
835 <td></td>
836 <td></td>
837 <td>&#10008;</td>
838 <td>&#10008;</td>
839 <td>&#10008;</td>
840 <td>&#10008;</td>
841 <td>&#10008;</td>
842 </tr>
843 <tr class="doc_warning">
844 <th class="rowhead indent">JIT</th>
845 <td>NO</td>
846 <td></td>
847 <td></td>
848 <td class="optl">&#10008;</td>
849 <td class="optl">&#10008;</td>
850 <td class="optl">&#10008;</td>
851 <td class="optl">&#10008;</td>
852 <td class="optl">&#10008;</td>
853 </tr>
854 <tr class="doc_warning">
855 <th class="rowhead indent">obj</th>
856 <td>NO</td>
857 <td></td>
858 <td></td>
859 <td class="optl">&#10008;</td>
860 <td class="optl">&#10008;</td>
861 <td class="optl">&#10008;</td>
862 <td class="optl">&#10008;</td>
863 <td class="optl">&#10008;</td>
864 </tr>
865 <tr class="doc_warning">
866 <th class="rowhead">live analysis</th>
867 <td>NO</td>
868 <td></td>
869 <td></td>
870 <td class="optl">&#10008;</td>
871 <td class="optl">&#10008;</td>
872 <td class="optl">&#10008;</td>
873 <td class="optl">&#10008;</td>
874 <td class="optl">&#10008;</td>
875 </tr>
876 <tr class="doc_warning">
877 <th class="rowhead">register map</th>
878 <td>NO</td>
879 <td></td>
880 <td></td>
881 <td class="optl">&#10008;</td>
882 <td class="optl">&#10008;</td>
883 <td class="optl">&#10008;</td>
884 <td class="optl">&#10008;</td>
885 <td class="optl">&#10008;</td>
886 </tr>
887 <tr>
888 <td colspan="10">
889 <div><span class="doc_warning">*</span> Derived pointers only pose a
890 hazard to copying collectors.</div>
891 <div><span class="optl">&#10008;</span> in gray denotes a feature which
892 could be utilized if available.</div>
893 </td>
894 </tr>
895</table>
896
897<p>To be clear, the collection techniques above are defined as:</p>
898
899<dl>
900 <dt>Shadow Stack</dt>
901 <dd>The mutator carefully maintains a linked list of stack root
902 descriptors.</dd>
903 <dt>Reference Counting</dt>
904 <dd>The mutator maintains a reference count for each object and frees an
905 object when its count falls to zero.</dd>
906 <dt>Mark-Sweep</dt>
907 <dd>When the heap is exhausted, the collector marks reachable objects starting
908 from the roots, then deallocates unreachable objects in a sweep
909 phase.</dd>
910 <dt>Copying</dt>
911 <dd>As reachability analysis proceeds, the collector copies objects from one
912 heap area to another, compacting them in the process. Copying collectors
913 enable highly efficient "bump pointer" allocation and can improve locality
914 of reference.</dd>
915 <dt>Incremental</dt>
916 <dd>(Including generational collectors.) Incremental collectors generally have
917 all the properties of a copying collector (regardless of whether the
918 mature heap is compacting), but bring the added complexity of requiring
919 write barriers.</dd>
920 <dt>Threaded</dt>
921 <dd>Denotes a multithreaded mutator; the collector must still stop the mutator
922 ("stop the world") before beginning reachability analysis. Stopping a
923 multithreaded mutator is a complicated problem. It generally requires
924 highly platform specific code in the runtime, and the production of
925 carefully designed machine code at safe points.</dd>
926 <dt>Concurrent</dt>
927 <dd>In this technique, the mutator and the collector run concurrently, with
928 the goal of eliminating pause times. In a <em>cooperative</em> collector,
929 the mutator further aids with collection should a pause occur, allowing
930 collection to take advantage of multiprocessor hosts. The "stop the world"
931 problem of threaded collectors is generally still present to a limited
932 extent. Sophisticated marking algorithms are necessary. Read barriers may
933 be necessary.</dd>
934</dl>
935
936<p>As the matrix indicates, LLVM's garbage collection infrastructure is already
937suitable for a wide variety of collectors, but does not currently extend to
938multithreaded programs. This will be added in the future as there is
939interest.</p>
940
941</div>
942
943<!-- ======================================================================= -->
944<div class="doc_subsection">
945 <a name="stack-map">Computing stack maps</a>
946</div>
947
948<div class="doc_text">
949
950<blockquote><pre
Gordon Henriksenad93c4f2007-12-11 00:30:17 +0000951>for (iterator I = begin(), E = end(); I != E; ++I) {
Gordon Henriksen01571ef2008-08-24 03:18:23 +0000952 GCFunctionInfo *FI = *I;
953 unsigned FrameSize = FI-&gt;getFrameSize();
954 size_t RootCount = FI-&gt;roots_size();
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000955
Gordon Henriksen01571ef2008-08-24 03:18:23 +0000956 for (GCFunctionInfo::roots_iterator RI = FI-&gt;roots_begin(),
957 RE = FI-&gt;roots_end();
958 RI != RE; ++RI) {
Gordon Henriksenad93c4f2007-12-11 00:30:17 +0000959 int RootNum = RI->Num;
960 int RootStackOffset = RI->StackOffset;
961 Constant *RootMetadata = RI->Metadata;
962 }
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000963}</pre></blockquote>
964
Gordon Henriksen01571ef2008-08-24 03:18:23 +0000965<p>LLVM automatically computes a stack map. All a <tt>GCStrategy</tt> needs to do
966is access it using <tt>GCFunctionMetadata::roots_begin()</tt> and
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000967-<tt>end()</tt>. If the <tt>llvm.gcroot</tt> intrinsic is eliminated before code
968generation by a custom lowering pass, LLVM's stack map will be empty.</p>
969
970</div>
971
972
973<!-- ======================================================================= -->
974<div class="doc_subsection">
975 <a name="init-roots">Initializing roots to null: <tt>InitRoots</tt></a>
976</div>
977
978<div class="doc_text">
979
980<blockquote><pre
Gordon Henriksen01571ef2008-08-24 03:18:23 +0000981>MyGC::MyGC() {
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000982 InitRoots = true;
983}</pre></blockquote>
984
985<p>When set, LLVM will automatically initialize each root to <tt>null</tt> upon
Gordon Henriksen01571ef2008-08-24 03:18:23 +0000986entry to the function. This prevents the GC's sweep phase from visiting
987uninitialized pointers, which will almost certainly cause it to crash. This
988initialization occurs before custom lowering, so the two may be used
989together.</p>
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000990
Gordon Henriksen01571ef2008-08-24 03:18:23 +0000991<p>Since LLVM does not yet compute liveness information, there is no means of
992distinguishing an uninitialized stack root from an initialized one. Therefore,
993this feature should be used by all GC plugins. It is enabled by default.</p>
Gordon Henriksen326e24f2007-09-27 19:31:36 +0000994
995</div>
996
997
998<!-- ======================================================================= -->
999<div class="doc_subsection">
1000 <a name="custom">Custom lowering of intrinsics: <tt>CustomRoots</tt>,
1001 <tt>CustomReadBarriers</tt>, and <tt>CustomWriteBarriers</tt></a>
1002</div>
1003
1004<div class="doc_text">
1005
Gordon Henriksen01571ef2008-08-24 03:18:23 +00001006<p>For GCs which use barriers or unusual treatment of stack roots, these
1007flags allow the collector to perform arbitrary transformations of the LLVM
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001008IR:</p>
1009
1010<blockquote><pre
Gordon Henriksen01571ef2008-08-24 03:18:23 +00001011>class MyGC : public GCStrategy {
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001012public:
Gordon Henriksen01571ef2008-08-24 03:18:23 +00001013 MyGC() {
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001014 CustomRoots = true;
1015 CustomReadBarriers = true;
1016 CustomWriteBarriers = true;
1017 }
1018
Gordon Henriksenad93c4f2007-12-11 00:30:17 +00001019 virtual bool initializeCustomLowering(Module &amp;M);
1020 virtual bool performCustomLowering(Function &amp;F);
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001021};</pre></blockquote>
1022
1023<p>If any of these flags are set, then LLVM suppresses its default lowering for
Gordon Henriksen01571ef2008-08-24 03:18:23 +00001024the corresponding intrinsics and instead calls
1025<tt>performCustomLowering</tt>.</p>
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001026
1027<p>LLVM's default action for each intrinsic is as follows:</p>
1028
1029<ul>
1030 <li><tt>llvm.gcroot</tt>: Pass through to the code generator to generate a
1031 stack map.</li>
1032 <li><tt>llvm.gcread</tt>: Substitute a <tt>load</tt> instruction.</li>
1033 <li><tt>llvm.gcwrite</tt>: Substitute a <tt>store</tt> instruction.</li>
1034</ul>
1035
1036<p>If <tt>CustomReadBarriers</tt> or <tt>CustomWriteBarriers</tt> are specified,
Gordon Henriksenad93c4f2007-12-11 00:30:17 +00001037then <tt>performCustomLowering</tt> <strong>must</strong> eliminate the
1038corresponding barriers.</p>
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001039
Gordon Henriksen01571ef2008-08-24 03:18:23 +00001040<p><tt>performCustomLowering</tt> must comply with the same restrictions as <a
1041href="WritingAnLLVMPass.html#runOnFunction"><tt
1042>FunctionPass::runOnFunction</tt></a>.
1043Likewise, <tt>initializeCustomLowering</tt> has the same semantics as <a
1044href="WritingAnLLVMPass.html#doInitialization_mod"><tt
1045>Pass::doInitialization(Module&amp;)</tt></a>.</p>
Gordon Henriksenad93c4f2007-12-11 00:30:17 +00001046
1047<p>The following can be used as a template:</p>
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001048
1049<blockquote><pre
Gordon Henriksenad93c4f2007-12-11 00:30:17 +00001050>#include "llvm/Module.h"
Gordon Henriksen0adede02007-12-22 23:32:32 +00001051#include "llvm/IntrinsicInst.h"
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001052
Gordon Henriksen01571ef2008-08-24 03:18:23 +00001053bool MyGC::initializeCustomLowering(Module &amp;M) {
Gordon Henriksenad93c4f2007-12-11 00:30:17 +00001054 return false;
1055}
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001056
Gordon Henriksen01571ef2008-08-24 03:18:23 +00001057bool MyGC::performCustomLowering(Function &amp;F) {
Gordon Henriksenad93c4f2007-12-11 00:30:17 +00001058 bool MadeChange = false;
1059
1060 for (Function::iterator BB = F.begin(), E = F.end(); BB != E; ++BB)
Gordon Henriksen74f4ded2007-12-22 23:34:26 +00001061 for (BasicBlock::iterator II = BB-&gt;begin(), E = BB-&gt;end(); II != E; )
1062 if (IntrinsicInst *CI = dyn_cast&lt;IntrinsicInst&gt;(II++))
Gordon Henriksen0adede02007-12-22 23:32:32 +00001063 if (Function *F = CI-&gt;getCalledFunction())
1064 switch (F-&gt;getIntrinsicID()) {
1065 case Intrinsic::gcwrite:
Gordon Henriksenad93c4f2007-12-11 00:30:17 +00001066 // Handle llvm.gcwrite.
Gordon Henriksen0adede02007-12-22 23:32:32 +00001067 CI-&gt;eraseFromParent();
Gordon Henriksenad93c4f2007-12-11 00:30:17 +00001068 MadeChange = true;
Gordon Henriksen0adede02007-12-22 23:32:32 +00001069 break;
1070 case Intrinsic::gcread:
Gordon Henriksenad93c4f2007-12-11 00:30:17 +00001071 // Handle llvm.gcread.
Gordon Henriksen0adede02007-12-22 23:32:32 +00001072 CI-&gt;eraseFromParent();
Gordon Henriksenad93c4f2007-12-11 00:30:17 +00001073 MadeChange = true;
Gordon Henriksen0adede02007-12-22 23:32:32 +00001074 break;
1075 case Intrinsic::gcroot:
Gordon Henriksenad93c4f2007-12-11 00:30:17 +00001076 // Handle llvm.gcroot.
Gordon Henriksen0adede02007-12-22 23:32:32 +00001077 CI-&gt;eraseFromParent();
Gordon Henriksenad93c4f2007-12-11 00:30:17 +00001078 MadeChange = true;
Gordon Henriksen0adede02007-12-22 23:32:32 +00001079 break;
Gordon Henriksenad93c4f2007-12-11 00:30:17 +00001080 }
1081
1082 return MadeChange;
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001083}</pre></blockquote>
1084
1085</div>
1086
1087
1088<!-- ======================================================================= -->
1089<div class="doc_subsection">
1090 <a name="safe-points">Generating safe points: <tt>NeededSafePoints</tt></a>
1091</div>
1092
1093<div class="doc_text">
1094
1095<p>LLVM can compute four kinds of safe points:</p>
1096
1097<blockquote><pre
1098>namespace GC {
1099 /// PointKind - The type of a collector-safe point.
1100 ///
1101 enum PointKind {
1102 Loop, //&lt; Instr is a loop (backwards branch).
1103 Return, //&lt; Instr is a return instruction.
1104 PreCall, //&lt; Instr is a call instruction.
1105 PostCall //&lt; Instr is the return address of a call.
1106 };
1107}</pre></blockquote>
1108
1109<p>A collector can request any combination of the four by setting the
1110<tt>NeededSafePoints</tt> mask:</p>
1111
1112<blockquote><pre
Gordon Henriksen01571ef2008-08-24 03:18:23 +00001113>MyGC::MyGC() {
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001114 NeededSafePoints = 1 &lt;&lt; GC::Loop
1115 | 1 &lt;&lt; GC::Return
1116 | 1 &lt;&lt; GC::PreCall
1117 | 1 &lt;&lt; GC::PostCall;
1118}</pre></blockquote>
1119
1120<p>It can then use the following routines to access safe points.</p>
1121
Gordon Henriksenad93c4f2007-12-11 00:30:17 +00001122<blockquote><pre
1123>for (iterator I = begin(), E = end(); I != E; ++I) {
Gordon Henriksen01571ef2008-08-24 03:18:23 +00001124 GCFunctionInfo *MD = *I;
Gordon Henriksenad93c4f2007-12-11 00:30:17 +00001125 size_t PointCount = MD-&gt;size();
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001126
Gordon Henriksen01571ef2008-08-24 03:18:23 +00001127 for (GCFunctionInfo::iterator PI = MD-&gt;begin(),
1128 PE = MD-&gt;end(); PI != PE; ++PI) {
Gordon Henriksenad93c4f2007-12-11 00:30:17 +00001129 GC::PointKind PointKind = PI-&gt;Kind;
1130 unsigned PointNum = PI-&gt;Num;
1131 }
1132}
1133</pre></blockquote>
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001134
1135<p>Almost every collector requires <tt>PostCall</tt> safe points, since these
1136correspond to the moments when the function is suspended during a call to a
1137subroutine.</p>
1138
1139<p>Threaded programs generally require <tt>Loop</tt> safe points to guarantee
1140that the application will reach a safe point within a bounded amount of time,
1141even if it is executing a long-running loop which contains no function
1142calls.</p>
1143
1144<p>Threaded collectors may also require <tt>Return</tt> and <tt>PreCall</tt>
1145safe points to implement "stop the world" techniques using self-modifying code,
1146where it is important that the program not exit the function without reaching a
1147safe point (because only the topmost function has been patched).</p>
1148
1149</div>
1150
1151
1152<!-- ======================================================================= -->
1153<div class="doc_subsection">
Gordon Henriksen01571ef2008-08-24 03:18:23 +00001154 <a name="assembly">Emitting assembly code: <tt>GCMetadataPrinter</tt></a>
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001155</div>
1156
1157<div class="doc_text">
1158
Gordon Henriksen01571ef2008-08-24 03:18:23 +00001159<p>LLVM allows a GC to print arbitrary assembly code before and after the rest
1160of a module's assembly code. At the end of the module, the GC can print stack
1161maps built by the code generator. (At the beginning, this information is not
1162yet computed.)</p>
1163
1164<p>Since AsmWriter and CodeGen are separate components of LLVM, a separate
1165abstract base class and registry is provided for printing assembly code, the
Gordon Henriksenc9c0b592009-03-02 03:47:20 +00001166<tt>GCMetadaPrinter</tt> and <tt>GCMetadataPrinterRegistry</tt>. The AsmWriter
Gordon Henriksen01571ef2008-08-24 03:18:23 +00001167will look for such a subclass if the <tt>GCStrategy</tt> sets
1168<tt>UsesMetadata</tt>:</p>
1169
1170<blockquote><pre
1171>MyGC::MyGC() {
1172 UsesMetadata = true;
1173}</pre></blockquote>
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001174
Gordon Henriksenad93c4f2007-12-11 00:30:17 +00001175<p>Note that LLVM does not currently have analogous APIs to support code
1176generation in the JIT, nor using the object writers.</p>
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001177
1178<blockquote><pre
Gordon Henriksen01571ef2008-08-24 03:18:23 +00001179>// lib/MyGC/MyGCPrinter.cpp - Example LLVM GC printer
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001180
Gordon Henriksen01571ef2008-08-24 03:18:23 +00001181#include "llvm/CodeGen/GCMetadataPrinter.h"
1182#include "llvm/Support/Compiler.h"
1183
1184using namespace llvm;
1185
1186namespace {
1187 class VISIBILITY_HIDDEN MyGCPrinter : public GCMetadataPrinter {
1188 public:
1189 virtual void beginAssembly(std::ostream &amp;OS, AsmPrinter &amp;AP,
1190 const TargetAsmInfo &amp;TAI);
1191
1192 virtual void finishAssembly(std::ostream &amp;OS, AsmPrinter &amp;AP,
1193 const TargetAsmInfo &amp;TAI);
1194 };
1195
1196 GCMetadataPrinterRegistry::Add&lt;MyGCPrinter&gt;
1197 X("mygc", "My bespoke garbage collector.");
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001198}</pre></blockquote>
1199
1200<p>The collector should use <tt>AsmPrinter</tt> and <tt>TargetAsmInfo</tt> to
Gordon Henriksenad93c4f2007-12-11 00:30:17 +00001201print portable assembly code to the <tt>std::ostream</tt>. The collector itself
1202contains the stack map for the entire module, and may access the
Gordon Henriksen01571ef2008-08-24 03:18:23 +00001203<tt>GCFunctionInfo</tt> using its own <tt>begin()</tt> and <tt>end()</tt>
Gordon Henriksenad93c4f2007-12-11 00:30:17 +00001204methods. Here's a realistic example:</p>
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001205
1206<blockquote><pre
1207>#include "llvm/CodeGen/AsmPrinter.h"
1208#include "llvm/Function.h"
Gordon Henriksenad93c4f2007-12-11 00:30:17 +00001209#include "llvm/Target/TargetMachine.h"
1210#include "llvm/Target/TargetData.h"
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001211#include "llvm/Target/TargetAsmInfo.h"
1212
Gordon Henriksen01571ef2008-08-24 03:18:23 +00001213void MyGCPrinter::beginAssembly(std::ostream &amp;OS, AsmPrinter &amp;AP,
Gordon Henriksenad93c4f2007-12-11 00:30:17 +00001214 const TargetAsmInfo &amp;TAI) {
1215 // Nothing to do.
1216}
1217
Gordon Henriksen01571ef2008-08-24 03:18:23 +00001218void MyGCPrinter::finishAssembly(std::ostream &amp;OS, AsmPrinter &amp;AP,
Gordon Henriksenad93c4f2007-12-11 00:30:17 +00001219 const TargetAsmInfo &amp;TAI) {
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001220 // Set up for emitting addresses.
1221 const char *AddressDirective;
1222 int AddressAlignLog;
Gordon Henriksenad93c4f2007-12-11 00:30:17 +00001223 if (AP.TM.getTargetData()->getPointerSize() == sizeof(int32_t)) {
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001224 AddressDirective = TAI.getData32bitsDirective();
1225 AddressAlignLog = 2;
1226 } else {
1227 AddressDirective = TAI.getData64bitsDirective();
1228 AddressAlignLog = 3;
1229 }
1230
1231 // Put this in the data section.
1232 AP.SwitchToDataSection(TAI.getDataSection());
1233
1234 // For each function...
Gordon Henriksenad93c4f2007-12-11 00:30:17 +00001235 for (iterator FI = begin(), FE = end(); FI != FE; ++FI) {
Gordon Henriksen01571ef2008-08-24 03:18:23 +00001236 GCFunctionInfo &amp;MD = **FI;
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001237
1238 // Emit this data structure:
1239 //
1240 // struct {
1241 // int32_t PointCount;
1242 // struct {
1243 // void *SafePointAddress;
1244 // int32_t LiveCount;
1245 // int32_t LiveOffsets[LiveCount];
1246 // } Points[PointCount];
1247 // } __gcmap_&lt;FUNCTIONNAME&gt;;
1248
1249 // Align to address width.
1250 AP.EmitAlignment(AddressAlignLog);
1251
1252 // Emit the symbol by which the stack map can be found.
1253 std::string Symbol;
1254 Symbol += TAI.getGlobalPrefix();
1255 Symbol += "__gcmap_";
1256 Symbol += MD.getFunction().getName();
1257 if (const char *GlobalDirective = TAI.getGlobalDirective())
1258 OS &lt;&lt; GlobalDirective &lt;&lt; Symbol &lt;&lt; "\n";
1259 OS &lt;&lt; TAI.getGlobalPrefix() &lt;&lt; Symbol &lt;&lt; ":\n";
1260
1261 // Emit PointCount.
1262 AP.EmitInt32(MD.size());
1263 AP.EOL("safe point count");
1264
1265 // And each safe point...
Gordon Henriksen01571ef2008-08-24 03:18:23 +00001266 for (GCFunctionInfo::iterator PI = MD.begin(),
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001267 PE = MD.end(); PI != PE; ++PI) {
1268 // Align to address width.
1269 AP.EmitAlignment(AddressAlignLog);
1270
1271 // Emit the address of the safe point.
1272 OS &lt;&lt; AddressDirective
1273 &lt;&lt; TAI.getPrivateGlobalPrefix() &lt;&lt; "label" &lt;&lt; PI-&gt;Num;
1274 AP.EOL("safe point address");
1275
1276 // Emit the stack frame size.
1277 AP.EmitInt32(MD.getFrameSize());
1278 AP.EOL("stack frame size");
1279
1280 // Emit the number of live roots in the function.
1281 AP.EmitInt32(MD.live_size(PI));
1282 AP.EOL("live root count");
1283
1284 // And for each live root...
Gordon Henriksen01571ef2008-08-24 03:18:23 +00001285 for (GCFunctionInfo::live_iterator LI = MD.live_begin(PI),
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001286 LE = MD.live_end(PI);
1287 LI != LE; ++LI) {
1288 // Print its offset within the stack frame.
1289 AP.EmitInt32(LI-&gt;StackOffset);
1290 AP.EOL("stack offset");
1291 }
1292 }
1293 }
1294}
1295</pre></blockquote>
1296
1297</div>
1298
1299
1300<!-- *********************************************************************** -->
1301<div class="doc_section">
Chris Lattner9b2a1842004-05-27 05:52:10 +00001302 <a name="references">References</a>
1303</div>
1304<!-- *********************************************************************** -->
1305
1306<div class="doc_text">
1307
1308<p><a name="appel89">[Appel89]</a> Runtime Tags Aren't Necessary. Andrew
1309W. Appel. Lisp and Symbolic Computation 19(7):703-705, July 1989.</p>
1310
1311<p><a name="goldberg91">[Goldberg91]</a> Tag-free garbage collection for
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001312strongly typed programming languages. Benjamin Goldberg. ACM SIGPLAN
Chris Lattner9b2a1842004-05-27 05:52:10 +00001313PLDI'91.</p>
1314
1315<p><a name="tolmach94">[Tolmach94]</a> Tag-free garbage collection using
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001316explicit type parameters. Andrew Tolmach. Proceedings of the 1994 ACM
Chris Lattner9b2a1842004-05-27 05:52:10 +00001317conference on LISP and functional programming.</p>
1318
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001319<p><a name="henderson02">[Henderson2002]</a> <a
1320href="http://citeseer.ist.psu.edu/henderson02accurate.html">
1321Accurate Garbage Collection in an Uncooperative Environment</a>.
1322Fergus Henderson. International Symposium on Memory Management 2002.</p>
1323
Chris Lattner9b2a1842004-05-27 05:52:10 +00001324</div>
Chris Lattner0d8c2db2004-05-23 21:02:20 +00001325
Gordon Henriksen326e24f2007-09-27 19:31:36 +00001326
Chris Lattner0d8c2db2004-05-23 21:02:20 +00001327<!-- *********************************************************************** -->
1328
1329<hr>
1330<address>
1331 <a href="http://jigsaw.w3.org/css-validator/check/referer"><img
Misha Brukman44408702008-12-11 17:34:48 +00001332 src="http://jigsaw.w3.org/css-validator/images/vcss-blue" alt="Valid CSS"></a>
Chris Lattner0d8c2db2004-05-23 21:02:20 +00001333 <a href="http://validator.w3.org/check/referer"><img
Misha Brukman44408702008-12-11 17:34:48 +00001334 src="http://www.w3.org/Icons/valid-html401-blue" alt="Valid HTML 4.01"></a>
Chris Lattner0d8c2db2004-05-23 21:02:20 +00001335
1336 <a href="mailto:sabre@nondot.org">Chris Lattner</a><br>
Reid Spencer05fe4b02006-03-14 05:39:39 +00001337 <a href="http://llvm.org">LLVM Compiler Infrastructure</a><br>
Chris Lattner0d8c2db2004-05-23 21:02:20 +00001338 Last modified: $Date$
1339</address>
1340
1341</body>
1342</html>