| Chris Lattner | 0d8c2db | 2004-05-23 21:02:20 +0000 | [diff] [blame] | 1 | <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" | 
 | 2 |                       "http://www.w3.org/TR/html4/strict.dtd"> | 
 | 3 | <html> | 
 | 4 | <head> | 
 | 5 |   <title>Accurate Garbage Collection with LLVM</title> | 
 | 6 |   <link rel="stylesheet" href="llvm.css" type="text/css"> | 
 | 7 | </head> | 
 | 8 | <body> | 
 | 9 |  | 
 | 10 | <div class="doc_title"> | 
 | 11 |   Accurate Garbage Collection with LLVM | 
 | 12 | </div> | 
 | 13 |  | 
 | 14 | <ol> | 
 | 15 |   <li><a href="#introduction">Introduction</a> | 
 | 16 |     <ul> | 
 | 17 |     <li><a href="#feature">GC features provided and algorithms supported</a></li> | 
 | 18 |     </ul> | 
 | 19 |   </li> | 
 | 20 |  | 
 | 21 |   <li><a href="#interfaces">Interfaces for user programs</a> | 
 | 22 |     <ul> | 
 | 23 |     <li><a href="#roots">Identifying GC roots on the stack: <tt>llvm.gcroot</tt></a></li> | 
| Chris Lattner | 0d8c2db | 2004-05-23 21:02:20 +0000 | [diff] [blame] | 24 |     <li><a href="#allocate">Allocating memory from the GC</a></li> | 
 | 25 |     <li><a href="#barriers">Reading and writing references to the heap</a></li> | 
 | 26 |     <li><a href="#explicit">Explicit invocation of the garbage collector</a></li> | 
 | 27 |     </ul> | 
 | 28 |   </li> | 
 | 29 |  | 
 | 30 |   <li><a href="#gcimpl">Implementing a garbage collector</a> | 
 | 31 |     <ul> | 
 | 32 |     <li><a href="#llvm_gc_readwrite">Implementing <tt>llvm_gc_read</tt> and <tt>llvm_gc_write</tt></a></li> | 
| Chris Lattner | 9b2a184 | 2004-05-27 05:52:10 +0000 | [diff] [blame] | 33 |     <li><a href="#callbacks">Callback functions used to implement the garbage collector</a></li> | 
| Chris Lattner | 0d8c2db | 2004-05-23 21:02:20 +0000 | [diff] [blame] | 34 |     </ul> | 
 | 35 |   </li> | 
| Chris Lattner | 9b2a184 | 2004-05-27 05:52:10 +0000 | [diff] [blame] | 36 |   <li><a href="#gcimpls">GC implementations available</a> | 
 | 37 |     <ul> | 
 | 38 |     <li><a href="#semispace">SemiSpace - A simple copying garbage collector</a></li> | 
 | 39 |   </li> | 
| Chris Lattner | 0d8c2db | 2004-05-23 21:02:20 +0000 | [diff] [blame] | 40 |  | 
 | 41 | <!-- | 
 | 42 |   <li><a href="#codegen">Implementing GC support in a code generator</a></li> | 
 | 43 | --> | 
 | 44 | </ol> | 
 | 45 |  | 
 | 46 | <div class="doc_author"> | 
 | 47 |   <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a></p> | 
 | 48 | </div> | 
 | 49 |  | 
 | 50 | <!-- *********************************************************************** --> | 
 | 51 | <div class="doc_section"> | 
 | 52 |   <a name="introduction">Introduction</a> | 
 | 53 | </div> | 
 | 54 | <!-- *********************************************************************** --> | 
 | 55 |  | 
 | 56 | <div class="doc_text"> | 
 | 57 |  | 
 | 58 | <p>Garbage collection is a widely used technique that frees the programmer from | 
 | 59 | having to know the life-times of heap objects, making software easier to produce | 
 | 60 | and maintain.  Many programming languages rely on garbage collection for | 
 | 61 | automatic memory management.  There are two primary forms of garbage collection: | 
 | 62 | conservative and accurate.</p> | 
 | 63 |  | 
 | 64 | <p>Conservative garbage collection often does not require any special support | 
 | 65 | from either the language or the compiler: it can handle non-type-safe | 
 | 66 | programming languages (such as C/C++) and does not require any special | 
 | 67 | information from the compiler.  The [LINK] Boehm collector is an example of a | 
 | 68 | state-of-the-art conservative collector.</p> | 
 | 69 |  | 
 | 70 | <p>Accurate garbage collection requires the ability to identify all pointers in | 
 | 71 | the program at run-time (which requires that the source-language be type-safe in | 
 | 72 | most cases).  Identifying pointers at run-time requires compiler support to | 
 | 73 | locate all places that hold live pointer variables at run-time, including the | 
 | 74 | <a href="#roots">processor stack and registers</a>.</p> | 
 | 75 |  | 
 | 76 | <p> | 
 | 77 | Conservative garbage collection is attractive because it does not require any | 
 | 78 | special compiler support, but it does have problems.  In particular, because the | 
 | 79 | conservative garbage collector cannot <i>know</i> that a particular word in the | 
 | 80 | machine is a pointer, it cannot move live objects in the heap (preventing the | 
 | 81 | use of compacting and generational GC algorithms) and it can occasionally suffer | 
 | 82 | from memory leaks due to integer values that happen to point to objects in the | 
 | 83 | program.  In addition, some aggressive compiler transformations can break | 
 | 84 | conservative garbage collectors (though these seem rare in practice). | 
 | 85 | </p> | 
 | 86 |  | 
 | 87 | <p> | 
 | 88 | Accurate garbage collectors do not suffer from any of these problems, but they | 
 | 89 | can suffer from degraded scalar optimization of the program.  In particular, | 
 | 90 | because the runtime must be able to identify and update all pointers active in | 
 | 91 | the program, some optimizations are less effective.  In practice, however, the | 
 | 92 | locality and performance benefits of using aggressive garbage allocation | 
 | 93 | techniques dominates any low-level losses. | 
 | 94 | </p> | 
 | 95 |  | 
 | 96 | <p> | 
 | 97 | This document describes the mechanisms and interfaces provided by LLVM to | 
 | 98 | support accurate garbage collection. | 
 | 99 | </p> | 
 | 100 |  | 
 | 101 | </div> | 
 | 102 |  | 
 | 103 | <!-- ======================================================================= --> | 
 | 104 | <div class="doc_subsection"> | 
 | 105 |   <a name="feature">GC features provided and algorithms supported</a> | 
 | 106 | </div> | 
 | 107 |  | 
 | 108 | <div class="doc_text"> | 
 | 109 |  | 
 | 110 | <p> | 
 | 111 | LLVM provides support for a broad class of garbage collection algorithms, | 
 | 112 | including compacting semi-space collectors, mark-sweep collectors, generational | 
 | 113 | collectors, and even reference counting implementations.  It includes support | 
 | 114 | for <a href="#barriers">read and write barriers</a>, and associating <a | 
 | 115 | href="#roots">meta-data with stack objects</a> (used for tagless garbage | 
 | 116 | collection).  All LLVM code generators support garbage collection, including the | 
 | 117 | C backend. | 
 | 118 | </p> | 
 | 119 |  | 
 | 120 | <p> | 
 | 121 | We hope that the primitive support built into LLVM is sufficient to support a | 
 | 122 | broad class of garbage collected languages, including Scheme, ML, scripting | 
 | 123 | languages, Java, C#, etc.  That said, the implemented garbage collectors may | 
 | 124 | need to be extended to support language-specific features such as finalization, | 
 | 125 | weak references, or other features.  As these needs are identified and | 
 | 126 | implemented, they should be added to this specification. | 
 | 127 | </p> | 
 | 128 |  | 
 | 129 | <p> | 
 | 130 | LLVM does not currently support garbage collection of multi-threaded programs or | 
 | 131 | GC-safe points other than function calls, but these will be added in the future | 
 | 132 | as there is interest. | 
 | 133 | </p> | 
 | 134 |  | 
 | 135 | </div> | 
 | 136 |  | 
 | 137 | <!-- *********************************************************************** --> | 
 | 138 | <div class="doc_section"> | 
 | 139 |   <a name="interfaces">Interfaces for user programs</a> | 
 | 140 | </div> | 
 | 141 | <!-- *********************************************************************** --> | 
 | 142 |  | 
 | 143 | <div class="doc_text"> | 
 | 144 |  | 
 | 145 | <p>This section describes the interfaces provided by LLVM and by the garbage | 
 | 146 | collector run-time that should be used by user programs.  As such, this is the | 
 | 147 | interface that front-end authors should generate code for. | 
 | 148 | </p> | 
 | 149 |  | 
 | 150 | </div> | 
 | 151 |  | 
 | 152 | <!-- ======================================================================= --> | 
 | 153 | <div class="doc_subsection"> | 
 | 154 |   <a name="roots">Identifying GC roots on the stack: <tt>llvm.gcroot</tt></a> | 
 | 155 | </div> | 
 | 156 |  | 
 | 157 | <div class="doc_text"> | 
 | 158 |  | 
 | 159 | <div class="doc_code"><tt> | 
 | 160 |   void %llvm.gcroot(<ty>** %ptrloc, <ty2>* %metadata) | 
 | 161 | </tt></div> | 
 | 162 |  | 
 | 163 | <p> | 
 | 164 | The <tt>llvm.gcroot</tt> intrinsic is used to inform LLVM of a pointer variable | 
 | 165 | on the stack.  The first argument contains the address of the variable on the | 
 | 166 | stack, and the second contains a pointer to metadata that should be associated | 
 | 167 | with the pointer (which <b>must</b> be a constant or global value address).  At | 
 | 168 | runtime, the <tt>llvm.gcroot</tt> intrinsic stores a null pointer into the | 
 | 169 | specified location to initialize the pointer.</p> | 
 | 170 |  | 
 | 171 | <p> | 
 | 172 | Consider the following fragment of Java code: | 
 | 173 | </p> | 
 | 174 |  | 
 | 175 | <pre> | 
 | 176 |        { | 
 | 177 |          Object X;   // A null-initialized reference to an object | 
 | 178 |          ... | 
 | 179 |        } | 
 | 180 | </pre> | 
 | 181 |  | 
 | 182 | <p> | 
 | 183 | This block (which may be located in the middle of a function or in a loop nest), | 
 | 184 | could be compiled to this LLVM code: | 
 | 185 | </p> | 
 | 186 |  | 
 | 187 | <pre> | 
 | 188 | Entry: | 
 | 189 |    ;; In the entry block for the function, allocate the | 
 | 190 |    ;; stack space for X, which is an LLVM pointer. | 
 | 191 |    %X = alloca %Object* | 
 | 192 |    ... | 
 | 193 |  | 
 | 194 |    ;; "CodeBlock" is the block corresponding to the start | 
| Reid Spencer | 03d186a | 2004-05-25 08:45:31 +0000 | [diff] [blame] | 195 |    ;;  of the scope above. | 
| Chris Lattner | 0d8c2db | 2004-05-23 21:02:20 +0000 | [diff] [blame] | 196 | CodeBlock: | 
 | 197 |    ;; Initialize the object, telling LLVM that it is now live. | 
 | 198 |    ;; Java has type-tags on objects, so it doesn't need any | 
 | 199 |    ;; metadata. | 
 | 200 |    call void %llvm.gcroot(%Object** %X, sbyte* null) | 
 | 201 |    ... | 
 | 202 |  | 
 | 203 |    ;; As the pointer goes out of scope, store a null value into | 
 | 204 |    ;; it, to indicate that the value is no longer live. | 
 | 205 |    store %Object* null, %Object** %X | 
 | 206 |    ... | 
 | 207 | </pre> | 
 | 208 |  | 
 | 209 | </div> | 
 | 210 |  | 
 | 211 | <!-- ======================================================================= --> | 
 | 212 | <div class="doc_subsection"> | 
| Chris Lattner | 0d8c2db | 2004-05-23 21:02:20 +0000 | [diff] [blame] | 213 |   <a name="allocate">Allocating memory from the GC</a> | 
 | 214 | </div> | 
 | 215 |  | 
 | 216 | <div class="doc_text"> | 
 | 217 |  | 
 | 218 | <div class="doc_code"><tt> | 
 | 219 |   sbyte *%llvm_gc_allocate(unsigned %Size) | 
 | 220 | </tt></div> | 
 | 221 |  | 
 | 222 | <p>The <tt>llvm_gc_allocate</tt> function is a global function defined by the | 
| Chris Lattner | aab3aff | 2004-06-09 03:59:05 +0000 | [diff] [blame^] | 223 | garbage collector implementation to allocate memory.  It returns a | 
| Chris Lattner | 0d8c2db | 2004-05-23 21:02:20 +0000 | [diff] [blame] | 224 | zeroed-out block of memory of the appropriate size.</p> | 
 | 225 |  | 
 | 226 | </div> | 
 | 227 |  | 
 | 228 | <!-- ======================================================================= --> | 
 | 229 | <div class="doc_subsection"> | 
 | 230 |   <a name="barriers">Reading and writing references to the heap</a> | 
 | 231 | </div> | 
 | 232 |  | 
 | 233 | <div class="doc_text"> | 
 | 234 |  | 
 | 235 | <div class="doc_code"><tt> | 
 | 236 |   sbyte *%llvm.gcread(sbyte **)<br> | 
 | 237 |   void %llvm.gcwrite(sbyte*, sbyte**) | 
 | 238 | </tt></div> | 
 | 239 |  | 
 | 240 | <p>Several of the more interesting garbage collectors (e.g., generational | 
 | 241 | collectors) need to be informed when the mutator (the program that needs garbage | 
 | 242 | collection) reads or writes object references into the heap.  In the case of a | 
 | 243 | generational collector, it needs to keep track of which "old" generation objects | 
 | 244 | have references stored into them.  The amount of code that typically needs to be | 
| Chris Lattner | aab3aff | 2004-06-09 03:59:05 +0000 | [diff] [blame^] | 245 | executed is usually quite small (and not on the critical path of any  | 
 | 246 | computation), so the overall performance impact of the inserted code is  | 
 | 247 | tolerable.</p> | 
| Chris Lattner | 0d8c2db | 2004-05-23 21:02:20 +0000 | [diff] [blame] | 248 |  | 
 | 249 | <p>To support garbage collectors that use read or write barriers, LLVM provides | 
 | 250 | the <tt>llvm.gcread</tt> and <tt>llvm.gcwrite</tt> intrinsics.  The first | 
 | 251 | intrinsic has exactly the same semantics as a non-volatile LLVM load and the | 
 | 252 | second has the same semantics as a non-volatile LLVM store.  At code generation | 
 | 253 | time, these intrinsics are replaced with calls into the garbage collector | 
 | 254 | (<tt><a href="#llvm_gc_readwrite">llvm_gc_read</a></tt> and <tt><a | 
 | 255 | href="#llvm_gc_readwrite">llvm_gc_write</a></tt> respectively), which are then | 
 | 256 | inlined into the code. | 
 | 257 | </p> | 
 | 258 |  | 
 | 259 | <p> | 
 | 260 | If you are writing a front-end for a garbage collected language, every load or | 
 | 261 | store of a reference from or to the heap should use these intrinsics instead of | 
 | 262 | normal LLVM loads/stores.</p> | 
 | 263 |  | 
 | 264 | </div> | 
 | 265 |  | 
 | 266 | <!-- ======================================================================= --> | 
 | 267 | <div class="doc_subsection"> | 
 | 268 |   <a name="initialize">Garbage collector startup and initialization</a> | 
 | 269 | </div> | 
 | 270 |  | 
 | 271 | <div class="doc_text"> | 
 | 272 |  | 
 | 273 | <div class="doc_code"><tt> | 
| Chris Lattner | 9b2a184 | 2004-05-27 05:52:10 +0000 | [diff] [blame] | 274 |   void %llvm_gc_initialize(unsigned %InitialHeapSize) | 
| Chris Lattner | 0d8c2db | 2004-05-23 21:02:20 +0000 | [diff] [blame] | 275 | </tt></div> | 
 | 276 |  | 
 | 277 | <p> | 
 | 278 | The <tt>llvm_gc_initialize</tt> function should be called once before any other | 
 | 279 | garbage collection functions are called.  This gives the garbage collector the | 
| Chris Lattner | 9b2a184 | 2004-05-27 05:52:10 +0000 | [diff] [blame] | 280 | chance to initialize itself and allocate the heap spaces.  The initial heap size | 
 | 281 | to allocate should be specified as an argument. | 
| Chris Lattner | 0d8c2db | 2004-05-23 21:02:20 +0000 | [diff] [blame] | 282 | </p> | 
 | 283 |  | 
 | 284 | </div> | 
 | 285 |  | 
 | 286 | <!-- ======================================================================= --> | 
 | 287 | <div class="doc_subsection"> | 
 | 288 |   <a name="explicit">Explicit invocation of the garbage collector</a> | 
 | 289 | </div> | 
 | 290 |  | 
 | 291 | <div class="doc_text"> | 
 | 292 |  | 
 | 293 | <div class="doc_code"><tt> | 
 | 294 |   void %llvm_gc_collect() | 
 | 295 | </tt></div> | 
 | 296 |  | 
 | 297 | <p> | 
 | 298 | The <tt>llvm_gc_collect</tt> function is exported by the garbage collector | 
 | 299 | implementations to provide a full collection, even when the heap is not | 
 | 300 | exhausted.  This can be used by end-user code as a hint, and may be ignored by | 
 | 301 | the garbage collector. | 
 | 302 | </p> | 
 | 303 |  | 
 | 304 | </div> | 
 | 305 |  | 
 | 306 |  | 
 | 307 | <!-- *********************************************************************** --> | 
 | 308 | <div class="doc_section"> | 
 | 309 |   <a name="gcimpl">Implementing a garbage collector</a> | 
 | 310 | </div> | 
 | 311 | <!-- *********************************************************************** --> | 
 | 312 |  | 
 | 313 | <div class="doc_text"> | 
 | 314 |  | 
 | 315 | <p> | 
| Chris Lattner | 9b2a184 | 2004-05-27 05:52:10 +0000 | [diff] [blame] | 316 | Implementing a garbage collector for LLVM is fairly straight-forward.  The LLVM | 
 | 317 | garbage collectors are provided in a form that makes them easy to link into the | 
 | 318 | language-specific runtime that a language front-end would use.  They require | 
 | 319 | functionality from the language-specific runtime to get information about <a | 
 | 320 | href="#gcdescriptors">where pointers are located in heap objects</a>. | 
 | 321 | </p> | 
 | 322 |  | 
 | 323 | <p>The | 
| Chris Lattner | 0d8c2db | 2004-05-23 21:02:20 +0000 | [diff] [blame] | 324 | implementation must include the <a | 
 | 325 | href="#allocate"><tt>llvm_gc_allocate</tt></a> and <a | 
 | 326 | href="#explicit"><tt>llvm_gc_collect</tt></a> functions, and it must implement | 
 | 327 | the <a href="#llvm_gc_readwrite">read/write barrier</a> functions as well.  To | 
 | 328 | do this, it will probably have to <a href="#traceroots">trace through the roots | 
 | 329 | from the stack</a> and understand the <a href="#gcdescriptors">GC descriptors | 
 | 330 | for heap objects</a>.  Luckily, there are some <a href="#gcimpls">example | 
 | 331 | implementations</a> available. | 
 | 332 | </p> | 
 | 333 | </div> | 
 | 334 |  | 
 | 335 |  | 
 | 336 | <!-- ======================================================================= --> | 
 | 337 | <div class="doc_subsection"> | 
 | 338 |   <a name="llvm_gc_readwrite">Implementing <tt>llvm_gc_read</tt> and <tt>llvm_gc_write</tt></a> | 
 | 339 | </div> | 
 | 340 |  | 
 | 341 | <div class="doc_text"> | 
 | 342 |   <div class="doc_code"><tt> | 
 | 343 |     void *llvm_gc_read(void **)<br> | 
 | 344 |     void llvm_gc_write(void*, void**) | 
 | 345 |  </tt></div> | 
 | 346 |  | 
 | 347 | <p> | 
 | 348 | These functions <i>must</i> be implemented in every garbage collector, even if | 
 | 349 | they do not need read/write barriers.  In this case, just load or store the | 
 | 350 | pointer, then return. | 
 | 351 | </p> | 
 | 352 |  | 
 | 353 | <p> | 
 | 354 | If an actual read or write barrier is needed, it should be straight-forward to | 
 | 355 | implement it.  Note that we may add a pointer to the start of the memory object | 
 | 356 | as a parameter in the future, if needed. | 
 | 357 | </p> | 
 | 358 |  | 
 | 359 | </div> | 
 | 360 |  | 
 | 361 | <!-- ======================================================================= --> | 
 | 362 | <div class="doc_subsection"> | 
| Chris Lattner | 9b2a184 | 2004-05-27 05:52:10 +0000 | [diff] [blame] | 363 |   <a name="callbacks">Callback functions used to implement the garbage collector</a></li> | 
 | 364 | </div> | 
 | 365 |  | 
 | 366 | Garbage collector implementations make use of call-back functions that are | 
 | 367 | implemented by other parts of the LLVM system. | 
 | 368 |  | 
 | 369 | <!--_________________________________________________________________________--> | 
 | 370 | <div class="doc_subsubsection"> | 
 | 371 |   <a name="traceroots">Tracing GC pointers from the program stack</a> | 
| Chris Lattner | 0d8c2db | 2004-05-23 21:02:20 +0000 | [diff] [blame] | 372 | </div> | 
 | 373 |  | 
 | 374 | <div class="doc_text"> | 
 | 375 |   <div class="doc_code"><tt> | 
 | 376 |      void llvm_cg_walk_gcroots(void (*FP)(void **Root, void *Meta)); | 
 | 377 |   </tt></div> | 
 | 378 |  | 
 | 379 | <p> | 
 | 380 | The <tt>llvm_cg_walk_gcroots</tt> function is a function provided by the code | 
 | 381 | generator that iterates through all of the GC roots on the stack, calling the | 
 | 382 | specified function pointer with each record.  For each GC root, the address of | 
 | 383 | the pointer and the meta-data (from the <a | 
 | 384 | href="#gcroot"><tt>llvm.gcroot</tt></a> intrinsic) are provided. | 
 | 385 | </p> | 
 | 386 | </div> | 
 | 387 |  | 
| Chris Lattner | 9b2a184 | 2004-05-27 05:52:10 +0000 | [diff] [blame] | 388 | <!--_________________________________________________________________________--> | 
 | 389 | <div class="doc_subsubsection"> | 
 | 390 |   <a name="staticroots">Tracing GC pointers from static roots</a> | 
 | 391 | </div> | 
| Chris Lattner | 0d8c2db | 2004-05-23 21:02:20 +0000 | [diff] [blame] | 392 |  | 
| Chris Lattner | 9b2a184 | 2004-05-27 05:52:10 +0000 | [diff] [blame] | 393 | <div class="doc_text"> | 
 | 394 | TODO | 
 | 395 | </div> | 
 | 396 |  | 
 | 397 |  | 
 | 398 | <!--_________________________________________________________________________--> | 
 | 399 | <div class="doc_subsubsection"> | 
 | 400 |   <a name="gcdescriptors">Tracing GC pointers from heap objects</a> | 
 | 401 | </div> | 
 | 402 |  | 
 | 403 | <div class="doc_text"> | 
 | 404 | <p> | 
 | 405 | The three most common ways to keep track of where pointers live in heap objects | 
 | 406 | are (listed in order of space overhead required):</p> | 
 | 407 |  | 
 | 408 | <ol> | 
 | 409 | <li>In languages with polymorphic objects, pointers from an object header are | 
 | 410 | usually used to identify the GC pointers in the heap object.  This is common for | 
 | 411 | object-oriented languages like Self, Smalltalk, Java, or C#.</li> | 
 | 412 |  | 
 | 413 | <li>If heap objects are not polymorphic, often the "shape" of the heap can be | 
 | 414 | determined from the roots of the heap or from some other meta-data [<a | 
 | 415 | href="#appel89">Appel89</a>, <a href="#goldberg91">Goldberg91</a>, <a | 
 | 416 | href="#tolmach94">Tolmach94</a>].  In this case, the garbage collector can | 
 | 417 | propagate the information around from meta data stored with the roots.  This | 
 | 418 | often eliminates the need to have a header on objects in the heap.  This is | 
 | 419 | common in the ML family.</li> | 
 | 420 |  | 
 | 421 | <li>If all heap objects have pointers in the same locations, or pointers can be | 
 | 422 | distinguished just by looking at them (e.g., the low order bit is clear), no | 
 | 423 | book-keeping is needed at all.  This is common for Lisp-like languages.</li> | 
 | 424 | </ol> | 
 | 425 |  | 
 | 426 | <p>The LLVM garbage collectors are capable of supporting all of these styles of | 
 | 427 | language, including ones that mix various implementations.  To do this, it | 
 | 428 | allows the source-language to associate meta-data with the <a | 
 | 429 | href="#roots">stack roots</a>, and the heap tracing routines can propagate the | 
 | 430 | information.  In addition, LLVM allows the front-end to extract GC information | 
 | 431 | from in any form from a specific object pointer (this supports situations #1 and | 
 | 432 | #3). | 
 | 433 | </p> | 
 | 434 |  | 
 | 435 | <p><b>Making this efficient</b></p> | 
 | 436 |  | 
 | 437 |  | 
 | 438 |  | 
 | 439 | </div> | 
 | 440 |  | 
 | 441 |  | 
 | 442 |  | 
 | 443 | <!-- *********************************************************************** --> | 
 | 444 | <div class="doc_section"> | 
| Chris Lattner | 0d8c2db | 2004-05-23 21:02:20 +0000 | [diff] [blame] | 445 |   <a name="gcimpls">GC implementations available</a> | 
 | 446 | </div> | 
| Chris Lattner | 9b2a184 | 2004-05-27 05:52:10 +0000 | [diff] [blame] | 447 | <!-- *********************************************************************** --> | 
| Chris Lattner | 0d8c2db | 2004-05-23 21:02:20 +0000 | [diff] [blame] | 448 |  | 
 | 449 | <div class="doc_text"> | 
 | 450 |  | 
 | 451 | <p> | 
 | 452 | To make this more concrete, the currently implemented LLVM garbage collectors | 
| Chris Lattner | 9b2a184 | 2004-05-27 05:52:10 +0000 | [diff] [blame] | 453 | all live in the <tt>llvm/runtime/GC/*</tt> directories in the LLVM source-base. | 
 | 454 | If you are interested in implementing an algorithm, there are many interesting | 
 | 455 | possibilities (mark/sweep, a generational collector, a reference counting | 
 | 456 | collector, etc), or you could choose to improve one of the existing algorithms. | 
| Chris Lattner | 0d8c2db | 2004-05-23 21:02:20 +0000 | [diff] [blame] | 457 | </p> | 
 | 458 |  | 
 | 459 | </div> | 
 | 460 |  | 
| Chris Lattner | 9b2a184 | 2004-05-27 05:52:10 +0000 | [diff] [blame] | 461 | <!-- ======================================================================= --> | 
 | 462 | <div class="doc_subsection"> | 
 | 463 |   <a name="semispace">SemiSpace - A simple copying garbage collector</a></li> | 
 | 464 | </div> | 
 | 465 |  | 
 | 466 | <div class="doc_text"> | 
 | 467 | <p> | 
 | 468 | SemiSpace is a very simple copying collector.  When it starts up, it allocates | 
 | 469 | two blocks of memory for the heap.  It uses a simple bump-pointer allocator to | 
 | 470 | allocate memory from the first block until it runs out of space.  When it runs | 
 | 471 | out of space, it traces through all of the roots of the program, copying blocks | 
 | 472 | to the other half of the memory space. | 
 | 473 | </p> | 
 | 474 |  | 
 | 475 | </div> | 
 | 476 |  | 
 | 477 | <!--_________________________________________________________________________--> | 
 | 478 | <div class="doc_subsubsection"> | 
 | 479 |   Possible Improvements | 
 | 480 | </div> | 
 | 481 |  | 
 | 482 | <div class="doc_text"> | 
 | 483 |  | 
 | 484 | <p> | 
 | 485 | If a collection cycle happens and the heap is not compacted very much (say less | 
 | 486 | than 25% of the allocated memory was freed), the memory regions should be | 
 | 487 | doubled in size.</p> | 
 | 488 |  | 
 | 489 | </div> | 
 | 490 |  | 
 | 491 | <!-- *********************************************************************** --> | 
 | 492 | <div class="doc_section"> | 
 | 493 |   <a name="references">References</a> | 
 | 494 | </div> | 
 | 495 | <!-- *********************************************************************** --> | 
 | 496 |  | 
 | 497 | <div class="doc_text"> | 
 | 498 |  | 
 | 499 | <p><a name="appel89">[Appel89]</a> Runtime Tags Aren't Necessary. Andrew | 
 | 500 | W. Appel. Lisp and Symbolic Computation 19(7):703-705, July 1989.</p> | 
 | 501 |  | 
 | 502 | <p><a name="goldberg91">[Goldberg91]</a> Tag-free garbage collection for | 
 | 503 | strongly typed programming languages.  Benjamin Goldberg. ACM SIGPLAN | 
 | 504 | PLDI'91.</p> | 
 | 505 |  | 
 | 506 | <p><a name="tolmach94">[Tolmach94]</a> Tag-free garbage collection using | 
 | 507 | explicit type parameters.  Andrew Tolmach.  Proceedings of the 1994 ACM | 
 | 508 | conference on LISP and functional programming.</p> | 
 | 509 |  | 
 | 510 | </div> | 
| Chris Lattner | 0d8c2db | 2004-05-23 21:02:20 +0000 | [diff] [blame] | 511 |  | 
 | 512 | <!-- *********************************************************************** --> | 
 | 513 |  | 
 | 514 | <hr> | 
 | 515 | <address> | 
 | 516 |   <a href="http://jigsaw.w3.org/css-validator/check/referer"><img | 
 | 517 |   src="http://jigsaw.w3.org/css-validator/images/vcss" alt="Valid CSS!"></a> | 
 | 518 |   <a href="http://validator.w3.org/check/referer"><img | 
 | 519 |   src="http://www.w3.org/Icons/valid-html401" alt="Valid HTML 4.01!"></a> | 
 | 520 |  | 
 | 521 |   <a href="mailto:sabre@nondot.org">Chris Lattner</a><br> | 
 | 522 |   <a href="http://llvm.cs.uiuc.edu">LLVM Compiler Infrastructure</a><br> | 
 | 523 |   Last modified: $Date$ | 
 | 524 | </address> | 
 | 525 |  | 
 | 526 | </body> | 
 | 527 | </html> |