=====================================
Garbage Collection Safepoints in LLVM
=====================================

.. contents::
   :local:
   :depth: 2

Status
=======

This document describes a set of extensions to LLVM to support garbage
collection. By now, these mechanisms are well proven: a commercial Java
implementation with a fully relocating collector has shipped using them.
There are a couple of places where bugs might still linger; these are called
out below.

They are still listed as "experimental" to indicate that no forward or backward
compatibility guarantees are offered across versions. If your use case is such
that you need some form of forward compatibility guarantee, please raise the
issue on the llvm-dev mailing list.

LLVM still supports an alternate mechanism for conservative garbage collection
support using the ``gcroot`` intrinsic. The ``gcroot`` mechanism is mostly of
historical interest at this point with one exception - its implementation of
shadow stacks has been used successfully by a number of language frontends and
is still supported.

Overview
========

To collect dead objects, garbage collectors must be able to identify
any references to objects contained within executing code, and,
depending on the collector, potentially update them. The collector
does not need this information at all points in code - that would make
the problem much harder - but only at well-defined points in the
execution known as 'safepoints'. For most collectors, it is sufficient
to track at least one copy of each unique pointer value. However, for
a collector which wishes to relocate objects directly reachable from
running code, a higher standard is required.

One additional challenge is that the compiler may compute intermediate
results ("derived pointers") which point outside of the allocation or
even into the middle of another allocation. The eventual use of this
intermediate value must yield an address within the bounds of the
allocation, but such "exterior derived pointers" may be visible to the
collector. Given this, a garbage collector cannot safely rely on the
runtime value of an address to indicate the object it is associated
with. If the garbage collector wishes to move any object, the
compiler must provide a mapping, for each pointer, to an indication of
its allocation.

To simplify the interaction between a collector and the compiled code,
most garbage collectors are organized in terms of three abstractions:
load barriers, store barriers, and safepoints.

#. A load barrier is a bit of code executed immediately after the
   machine load instruction, but before any use of the value loaded.
   Depending on the collector, such a barrier may be needed for all
   loads, merely loads of a particular type (in the original source
   language), or none at all.

#. Analogously, a store barrier is a code fragment that runs
   immediately before the machine store instruction, but after the
   computation of the value stored. The most common use of a store
   barrier is to update a 'card table' in a generational garbage
   collector.

#. A safepoint is a location at which pointers visible to the compiled
   code (i.e. currently in registers or on the stack) are allowed to
   change. After the safepoint completes, the actual pointer value
   may differ, but the 'object' (as seen by the source language)
   pointed to will not.

   Note that the term 'safepoint' is somewhat overloaded. It refers to
   both the location at which the machine state is parsable and the
   coordination protocol involved in bringing application threads to a
   point at which the collector can safely use that information. The
   term "statepoint" as used in this document refers exclusively to the
   former.

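To make the store barrier described above concrete, the following is a minimal
C++ sketch of card marking. It is not taken from LLVM or from any particular
runtime: the ``CardTable`` type, the ``storeWithBarrier`` helper, the 512-byte
card size, and the toy heap of 64-bit slots are all assumptions made for
illustration.

.. code-block:: c++

  #include <cassert>
  #include <cstddef>
  #include <cstdint>
  #include <vector>

  // Hypothetical card table: one byte per 512-byte "card" of a toy heap.
  // All names and sizes here are illustrative assumptions.
  struct CardTable {
    static constexpr std::size_t CardShift = 9; // 2^9 = 512-byte cards
    std::vector<std::uint8_t> Cards;

    explicit CardTable(std::size_t HeapBytes)
        : Cards((HeapBytes >> CardShift) + 1, 0) {}

    // Mark the card covering the given heap byte offset as dirty.
    void markDirty(std::size_t HeapOffset) {
      assert((HeapOffset >> CardShift) < Cards.size() && "offset outside heap");
      Cards[HeapOffset >> CardShift] = 1;
    }

    bool isDirty(std::size_t HeapOffset) const {
      return Cards[HeapOffset >> CardShift] != 0;
    }
  };

  // Store barrier: runs after the value has been computed and immediately
  // before the store itself. The heap is modelled as an array of 64-bit slots.
  inline void storeWithBarrier(std::vector<std::uint64_t> &Heap, CardTable &CT,
                               std::size_t SlotIndex, std::uint64_t Value) {
    CT.markDirty(SlotIndex * sizeof(std::uint64_t)); // the barrier
    Heap[SlotIndex] = Value;                          // the actual store
  }

In a scheme like this, the collector can later restrict its scan to dirty
cards. A real implementation would typically emit the marking inline in
generated code rather than call a helper.
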
This document focuses on the last item - compiler support for
safepoints in generated code. We will assume that an outside
mechanism has decided where to place safepoints. From our
perspective, all safepoints will be function calls. To support
relocation of objects directly reachable from values in compiled code,
the collector must be able to:

#. identify every copy of a pointer (including copies introduced by
   the compiler itself) at the safepoint,
#. identify which object each pointer relates to, and
#. potentially update each of those copies.

This document describes the mechanism by which an LLVM-based compiler
can provide this information to a language runtime/collector, and
ensure that all pointers can be read and updated if desired.

At a high level, LLVM has been extended to support compiling to an abstract
machine which extends the actual target with a non-integral pointer type
suitable for representing a garbage collected reference to an object. In
particular, such non-integral pointer types have no defined mapping to an
integer representation. This semantic quirk allows the runtime to pick an
integer mapping for each point in the program, allowing relocations of objects
without visible effects.

Warning: Non-Integral Pointer Types are a newly added concept in LLVM IR.
It's possible that we've missed disabling some of the optimizations which
assume an integral value for pointers. If you find such a case, please
file a bug or share a patch.

Warning: There is one currently known semantic hole in the definition of
non-integral pointers which has not been addressed upstream. To work around
this, you need to disable speculation of loads unless the memory type
(non-integral pointer vs anything else) is known to be unchanged. That is, it
is not safe to speculate a load if doing so causes a non-integral pointer value
to be loaded as any other type or vice versa. In practice, this restriction is
well isolated to ``isSafeToSpeculativelyExecute`` in ValueTracking.cpp.

This high level abstract machine model is used for most of the LLVM optimizer.
Before starting code generation, we switch representations to an explicit form.
In theory, a frontend could directly generate this low level explicit form, but
doing so is likely to inhibit optimization.

The heart of the explicit approach is to construct (or rewrite) the IR in a
manner where the possible updates performed by the garbage collector are
explicitly visible in the IR. Doing so requires that we:

#. create a new SSA value for each potentially relocated pointer, and
   ensure that no use of the original (non relocated) value is
   reachable after the safepoint,
#. specify the relocation in a way which is opaque to the compiler to
   ensure that the optimizer cannot introduce new uses of an
   unrelocated value after a statepoint (this prevents the optimizer
   from performing unsound optimizations), and
#. record a mapping of live pointers (and the allocation they're
   associated with) for each statepoint.

At the most abstract level, inserting a safepoint can be thought of as
replacing a call instruction with a call to a multiple return value
function which calls the original target of the call, returns
its result, and returns updated values for any live pointers to
garbage collected objects.

Note that the task of identifying all live pointers to garbage
collected values, transforming the IR to expose a pointer giving the
base object for every such live pointer, and inserting all the
intrinsics correctly is explicitly out of scope for this document.
The recommended approach is to use the :ref:`utility passes
<statepoint-utilities>` described below.

This abstract function call is concretely represented by a sequence of
intrinsic calls known collectively as a "statepoint relocation sequence".

Let's consider a simple call in LLVM IR:

.. code-block:: llvm

  define i8 addrspace(1)* @test1(i8 addrspace(1)* %obj)
         gc "statepoint-example" {
    call void ()* @foo()
    ret i8 addrspace(1)* %obj
  }

Depending on our language, we may need to allow a safepoint during the
execution of ``foo``. If so, we need to let the collector update local values
in the current frame. If we don't, we'll be accessing a potentially invalid
reference once we eventually return from the call.

In this example, we need to relocate the SSA value ``%obj``. Since we can't
actually change the value in the SSA value ``%obj``, we need to introduce a new
SSA value ``%obj.relocated`` which represents the potentially changed value of
``%obj`` after the safepoint and update any following uses appropriately. The
resulting relocation sequence is:

.. code-block:: llvm

  define i8 addrspace(1)* @test1(i8 addrspace(1)* %obj)
         gc "statepoint-example" {
    %0 = call token (i64, i32, void ()*, i32, i32, ...)* @llvm.experimental.gc.statepoint.p0f_isVoidf(i64 0, i32 0, void ()* @foo, i32 0, i32 0, i32 0, i32 0, i8 addrspace(1)* %obj)
    %obj.relocated = call coldcc i8 addrspace(1)* @llvm.experimental.gc.relocate.p1i8(token %0, i32 7, i32 7)
    ret i8 addrspace(1)* %obj.relocated
  }

Ideally, this sequence would have been represented as an M argument, N
return value function (where M is the number of values being
relocated + the original call arguments and N is the original return
value + each relocated value), but LLVM does not easily support such a
representation.

Instead, the statepoint intrinsic marks the actual site of the
safepoint or statepoint. The statepoint returns a token value (which
exists only at compile time). To get back the original return value
of the call, we use the ``gc.result`` intrinsic. To get the relocation
of each pointer in turn, we use the ``gc.relocate`` intrinsic with the
appropriate index. Note that both the ``gc.relocate`` and ``gc.result`` are
tied to the statepoint. The combination forms a "statepoint relocation
sequence" and represents the entirety of a parseable call or 'statepoint'.

When lowered, this example would generate the following x86 assembly:

.. code-block:: gas

    .globl  test1
    .align  16, 0x90
    pushq   %rax
    callq   foo
  .Ltmp1:
    movq    (%rsp), %rax  # This load is redundant (oops!)
    popq    %rdx
    retq

Each of the potentially relocated values has been spilled to the
stack, and its location has been recorded in the
:ref:`Stack Map section <stackmap-section>`. If the garbage collector
needs to update any of these pointers during the call, it knows
exactly what to change.

The relevant parts of the StackMap section for our example are:

.. code-block:: gas

  # This describes the call site
  # Stack Maps: callsite 2882400000
    .quad   2882400000
    .long   .Ltmp1-test1
    .short  0
  # .. 8 entries skipped ..
  # This entry describes the spill slot which is directly addressable
  # off RSP with offset 0. Given the value was spilled with a pushq,
  # that makes sense.
  # Stack Maps: Loc 8: Direct RSP [encoding: .byte 2, .byte 8, .short 7, .int 0]
    .byte   2
    .byte   8
    .short  7
    .long   0

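As an illustration of how a runtime might read the location record shown
above, here is a small C++ sketch of a decoder. It assumes the 8-byte,
little-endian layout given in the encoding comment of this example (kind byte,
size byte, 16-bit DWARF register number, 32-bit signed offset); other versions
of the stack map format lay records out differently. The ``Location`` struct
and ``decodeLocation`` function are hypothetical names, and the kind values
follow the location encoding of the stack map format.

.. code-block:: c++

  #include <cassert>
  #include <cstdint>

  // Location kinds as listed in the stack map format.
  enum class LocationKind : std::uint8_t {
    Register = 1,
    Direct = 2,
    Indirect = 3,
    Constant = 4,
    ConstantIndex = 5
  };

  // One location record, laid out as in the example above (hypothetical type).
  struct Location {
    LocationKind Kind;
    std::uint8_t Size;      // size of the value in bytes
    std::uint16_t DwarfReg; // DWARF register number (7 is RSP on x86-64)
    std::int32_t Offset;    // signed offset from the register
  };

  // Decode 8 little-endian bytes into a Location.
  inline Location decodeLocation(const std::uint8_t *Bytes) {
    assert(Bytes && "null record");
    Location L;
    L.Kind = static_cast<LocationKind>(Bytes[0]);
    L.Size = Bytes[1];
    L.DwarfReg = static_cast<std::uint16_t>(Bytes[2] | (Bytes[3] << 8));
    std::uint32_t Raw = static_cast<std::uint32_t>(Bytes[4]) |
                        (static_cast<std::uint32_t>(Bytes[5]) << 8) |
                        (static_cast<std::uint32_t>(Bytes[6]) << 16) |
                        (static_cast<std::uint32_t>(Bytes[7]) << 24);
    L.Offset = static_cast<std::int32_t>(Raw);
    return L;
  }

Feeding it the bytes ``2, 8, 7, 0, 0, 0, 0, 0`` from the example yields a
``Direct`` location of size 8 relative to DWARF register 7 with offset 0.
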
This example was taken from the tests for the :ref:`RewriteStatepointsForGC`
utility pass. As such, its full StackMap can be easily examined with the
following command:

.. code-block:: bash

  opt -rewrite-statepoints-for-gc test/Transforms/RewriteStatepointsForGC/basics.ll -S | llc -debug-only=stackmaps

Base & Derived Pointers
^^^^^^^^^^^^^^^^^^^^^^^

A "base pointer" is one which points to the starting address of an allocation
(object). A "derived pointer" is one which is offset from a base pointer by
some amount. When relocating objects, a garbage collector needs to be able
to relocate each derived pointer associated with an allocation to the same
offset from the new address.

"Interior derived pointers" remain within the bounds of the allocation
they're associated with. As a result, the base object can be found at
runtime provided the bounds of allocations are known to the runtime system.

"Exterior derived pointers" are outside the bounds of the associated object;
they may even fall within *another* allocation's address range. As a result,
there is no way for a garbage collector to determine which allocation they
are associated with at runtime and compiler support is needed.

The ``gc.relocate`` intrinsic supports an explicit operand for describing the
allocation associated with a derived pointer. This operand is frequently
referred to as the base operand. Strictly speaking, it does not have to be
a base pointer, but it does need to lie within the bounds of the associated
allocation. Some collectors may require that the operand be an actual base
pointer rather than merely an internal derived pointer. Note that during
lowering both the base and derived pointer operands are required to be live
over the associated call safepoint even if the base is otherwise unused
afterwards.

If we extend our previous example to include a pointless derived pointer,
we get:

.. code-block:: llvm

  define i8 addrspace(1)* @test1(i8 addrspace(1)* %obj)
         gc "statepoint-example" {
    %gep = getelementptr i8, i8 addrspace(1)* %obj, i64 20000
    %token = call token (i64, i32, void ()*, i32, i32, ...)* @llvm.experimental.gc.statepoint.p0f_isVoidf(i64 0, i32 0, void ()* @foo, i32 0, i32 0, i32 0, i32 0, i8 addrspace(1)* %obj, i8 addrspace(1)* %gep)
    %obj.relocated = call i8 addrspace(1)* @llvm.experimental.gc.relocate.p1i8(token %token, i32 7, i32 7)
    %gep.relocated = call i8 addrspace(1)* @llvm.experimental.gc.relocate.p1i8(token %token, i32 7, i32 8)
    %p = getelementptr i8, i8 addrspace(1)* %gep.relocated, i64 -20000
    ret i8 addrspace(1)* %p
  }

Note that in this example ``%p`` and ``%obj.relocated`` are the same address
and we could replace one with the other, potentially removing the derived
pointer from the live set at the safepoint entirely.

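The relocation rule for derived pointers can be sketched on the runtime side
in a few lines of C++. This is only an illustration of the arithmetic, with
pointers modelled as integers; the ``BaseDerivedPair`` type and the
``relocate`` function are hypothetical and do not correspond to any LLVM API.

.. code-block:: c++

  #include <cassert>
  #include <cstdint>

  // Hypothetical record for one live pointer at a safepoint: the derived
  // pointer together with the base pointer of its allocation.
  struct BaseDerivedPair {
    std::uintptr_t Base;
    std::uintptr_t Derived;
  };

  // Relocate a pair when the collector moves the allocation to NewBase.
  // The derived pointer keeps its offset from the base, even when that
  // offset places it outside the allocation (an exterior derived pointer).
  inline BaseDerivedPair relocate(BaseDerivedPair P, std::uintptr_t NewBase) {
    // Unsigned subtraction wraps, so negative offsets are handled as well.
    std::uintptr_t Offset = P.Derived - P.Base;
    return {NewBase, NewBase + Offset};
  }

Because the offset is computed from the recorded base rather than inferred
from the derived address, this works even when the derived pointer happens to
fall inside some other allocation.
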
.. _gc_transition_args:

GC Transitions
^^^^^^^^^^^^^^^^^^

As a practical consideration, many garbage-collected systems allow code that is
collector-aware ("managed code") to call code that is not collector-aware
("unmanaged code"). It is common that such calls must also be safepoints, since
it is desirable to allow the collector to run during the execution of
unmanaged code. Furthermore, it is common that coordinating the transition from
managed to unmanaged code requires extra code generation at the call site to
inform the collector of the transition. In order to support these needs, a
statepoint may be marked as a GC transition, and data that is necessary to
perform the transition (if any) may be provided as additional arguments to the
statepoint.

Note that although in many cases statepoints may be inferred to be GC
transitions based on the function symbols involved (e.g. a call from a
function with GC strategy "foo" to a function with GC strategy "bar"),
indirect calls that are also GC transitions must also be supported. This
requirement is the driving force behind the decision to require that GC
transitions are explicitly marked.

Let's revisit the sample given above, this time treating the call to ``@foo``
as a GC transition. Depending on our target, the transition code may need to
access some extra state in order to inform the collector of the transition.
Let's assume a hypothetical GC--somewhat unimaginatively named "hypothetical-gc"
--that requires that a TLS variable be written to before and after a call
to unmanaged code. The resulting relocation sequence is:

.. code-block:: llvm

  @Flag = thread_local global i32 0, align 4

  define i8 addrspace(1)* @test1(i8 addrspace(1) *%obj)
         gc "hypothetical-gc" {

    %0 = call token (i64, i32, void ()*, i32, i32, ...)* @llvm.experimental.gc.statepoint.p0f_isVoidf(i64 0, i32 0, void ()* @foo, i32 0, i32 1, i32 1, i32* @Flag, i32 0, i8 addrspace(1)* %obj)
    %obj.relocated = call coldcc i8 addrspace(1)* @llvm.experimental.gc.relocate.p1i8(token %0, i32 8, i32 8)
    ret i8 addrspace(1)* %obj.relocated
  }

During lowering, this will result in an instruction selection DAG that looks
something like:

::

  CALLSEQ_START
  ...
  GC_TRANSITION_START (lowered i32 *@Flag), SRCVALUE i32 *Flag
  STATEPOINT
  GC_TRANSITION_END (lowered i32 *@Flag), SRCVALUE i32 *Flag
  ...
  CALLSEQ_END

In order to generate the necessary transition code, the backend for each target
supported by "hypothetical-gc" must be modified to lower ``GC_TRANSITION_START``
and ``GC_TRANSITION_END`` nodes appropriately when the "hypothetical-gc"
strategy is in use for a particular function. Assuming that such lowering has
been added for X86, the generated assembly would be:

.. code-block:: gas

    .globl  test1
    .align  16, 0x90
    pushq   %rax
    movl    $1, %fs:Flag@TPOFF
    callq   foo
    movl    $0, %fs:Flag@TPOFF
  .Ltmp1:
    movq    (%rsp), %rax  # This load is redundant (oops!)
    popq    %rdx
    retq

Note that the design as presented above is not fully implemented: in particular,
strategy-specific lowering is not present, and all GC transitions are emitted
as a single no-op before and after the call instruction. These no-ops are often
removed by the backend during dead machine instruction elimination.

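For readers who find source-level code easier to follow than assembly, the
effect of the transition code for the hypothetical GC described above can be
modelled in C++ as follows. This is only an analogy for what the lowered
``GC_TRANSITION_START`` and ``GC_TRANSITION_END`` nodes would do;
``GCTransitionGuard`` and ``callUnmanaged`` are invented names, and ``Flag``
stands in for the TLS variable of the example.

.. code-block:: c++

  #include <cassert>

  // Hypothetical thread-local flag read by the collector: nonzero while the
  // current thread is executing unmanaged code.
  thread_local int Flag = 0;

  // RAII guard playing the role of GC_TRANSITION_START / GC_TRANSITION_END.
  struct GCTransitionGuard {
    GCTransitionGuard() { Flag = 1; }  // written before the call
    ~GCTransitionGuard() { Flag = 0; } // written after the call returns
  };

  // Call an unmanaged function with the transition writes bracketing the call.
  template <typename Fn> auto callUnmanaged(Fn &&Callee) {
    GCTransitionGuard Guard;
    return Callee();
  }

While the callee runs, a collector consulting ``Flag`` would see that the
thread is in unmanaged code; once the call returns, the flag is cleared again.
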

Intrinsics
===========

'llvm.experimental.gc.statepoint' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

      declare token
        @llvm.experimental.gc.statepoint(i64 <id>, i32 <num patch bytes>,
                       func_type <target>,
                       i64 <#call args>, i64 <flags>,
                       ... (call parameters),
                       i64 <# transition args>, ... (transition parameters),
                       i64 <# deopt args>, ... (deopt parameters),
                       ... (gc parameters))

Overview:
"""""""""

The statepoint intrinsic represents a call which is parse-able by the
runtime.

Operands:
"""""""""

The 'id' operand is a constant integer that is reported as the ID
field in the generated stackmap. LLVM does not interpret this
parameter in any way and its meaning is up to the statepoint user to
decide. Note that LLVM is free to duplicate code containing
statepoint calls, and this may transform IR that had a unique 'id' per
lexical call to statepoint to IR that does not.

If 'num patch bytes' is non-zero then the call instruction
corresponding to the statepoint is not emitted and LLVM emits 'num
patch bytes' bytes of nops in its place. LLVM will emit code to
prepare the function arguments and retrieve the function return value
in accordance with the calling convention; the former before the nop
sequence and the latter after the nop sequence. It is expected that
the user will patch over the 'num patch bytes' bytes of nops with a
calling sequence specific to their runtime before executing the
generated machine code. There are no guarantees with respect to the
alignment of the nop sequence. Unlike :doc:`StackMaps`, statepoints do
not have a concept of shadow bytes. Note that semantically the
statepoint still represents a call or invoke to 'target', and the nop
sequence after patching is expected to represent an operation
equivalent to a call or invoke to 'target'.

The 'target' operand is the function actually being called. The
target can be specified as either a symbolic LLVM function, or as an
arbitrary Value of appropriate function type. Note that the function
type must match the signature of the callee and the types of the 'call
parameters' arguments.

The '#call args' operand is the number of arguments to the actual
call. It must exactly match the number of arguments passed in the
'call parameters' variable length section.

The 'flags' operand is used to specify extra information about the
statepoint. This is currently only used to mark certain statepoints
as GC transitions. This operand is a 64-bit integer with the following
layout, where bit 0 is the least significant bit:

+-------+---------------------------------------------------+
| Bit # | Usage                                             |
+=======+===================================================+
|     0 | Set if the statepoint is a GC transition, cleared |
|       | otherwise.                                        |
+-------+---------------------------------------------------+
|  1-63 | Reserved for future use; must be cleared.         |
+-------+---------------------------------------------------+

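The layout in the table can be expressed as a pair of bit tests. The sketch
below is illustrative only; the constant and function names are invented for
this example rather than taken from LLVM's headers.

.. code-block:: c++

  #include <cassert>
  #include <cstdint>

  // Bit 0 of the 'flags' operand marks a GC transition (hypothetical name).
  constexpr std::uint64_t GCTransitionBit = 1ULL << 0;

  // True if the statepoint is marked as a GC transition.
  constexpr bool isGCTransition(std::uint64_t Flags) {
    return (Flags & GCTransitionBit) != 0;
  }

  // Bits 1-63 are reserved and must be cleared.
  constexpr bool hasValidFlags(std::uint64_t Flags) {
    return (Flags & ~GCTransitionBit) == 0;
  }

Under this reading, the only valid values of 'flags' are 0 (an ordinary
statepoint) and 1 (a GC transition).
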
Philip Reames | dfc238b | 2015-01-02 19:46:49 +0000 | [diff] [blame] | 447 | The 'call parameters' arguments are simply the arguments which need to |
| 448 | be passed to the call target. They will be lowered according to the |
| 449 | specified calling convention and otherwise handled like a normal call |
| 450 | instruction. The number of arguments must exactly match what is |
| 451 | specified in '# call args'. The types must match the signature of |
| 452 | 'target'. |
Philip Reames | f612322 | 2014-12-02 19:37:00 +0000 | [diff] [blame] | 453 | |
Pat Gavlin | cc0431d | 2015-05-08 18:07:42 +0000 | [diff] [blame] | 454 | The 'transition parameters' arguments contain an arbitrary list of |
| 455 | Values which need to be passed to GC transition code. They will be |
| 456 | lowered and passed as operands to the appropriate GC_TRANSITION nodes |
| 457 | in the selection DAG. It is assumed that these arguments must be |
| 458 | available before and after (but not necessarily during) the execution |
| 459 | of the callee. The '# transition args' field indicates how many operands |
| 460 | are to be interpreted as 'transition parameters'. |
| 461 | |
Philip Reames | dfc238b | 2015-01-02 19:46:49 +0000 | [diff] [blame] | 462 | The 'deopt parameters' arguments contain an arbitrary list of Values |
| 463 | which is meaningful to the runtime. The runtime may read any of these |
| 464 | values, but is assumed not to modify them. If the garbage collector |
| 465 | might need to modify one of these values, it must also be listed in |
| 466 | the 'gc parameters' argument list. The '# deopt args' field indicates |
| 467 | how many operands are to be interpreted as 'deopt parameters'. |
Philip Reames | f612322 | 2014-12-02 19:37:00 +0000 | [diff] [blame] | 468 | |
Philip Reames | dfc238b | 2015-01-02 19:46:49 +0000 | [diff] [blame] | 469 | The 'gc parameters' arguments contain every pointer to a garbage |
| 470 | collector object which potentially needs to be updated by the garbage |
| 471 | collector. Note that the argument list must explicitly contain a base |
| 472 | pointer for every derived pointer listed. The order of arguments is |
| 473 | unimportant. Unlike the other variable length parameter sets, this |
| 474 | list is not length prefixed. |
Philip Reames | f612322 | 2014-12-02 19:37:00 +0000 | [diff] [blame] | 475 | |
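As a rough illustration of how the operand groups described above line up, the
following sketch annotates a single statepoint call. The names ``@bar`` and
``@layout`` and the specific deopt values are hypothetical, and the mangled
intrinsic suffix is an assumption patterned on the typed-pointer naming used
elsewhere in this document.

.. code-block:: llvm

  declare void @bar(i32)
  declare token @llvm.experimental.gc.statepoint.p0f_isVoidi32f(i64, i32, void (i32)*, i32, i32, ...)

  define void @layout(i8 addrspace(1)* %obj, i32 %x) gc "statepoint-example" {
    %tok = call token (i64, i32, void (i32)*, i32, i32, ...) @llvm.experimental.gc.statepoint.p0f_isVoidi32f(
        i64 0, i32 0, void (i32)* @bar,  ; ID, num patch bytes, target
        i32 1, i32 0,                    ; # call args, flags (bit 0 clear: not a GC transition)
        i32 %x,                          ; call parameters (exactly one, matching # call args)
        i32 0,                           ; # transition args (none follow)
        i32 2, i32 7, i32 %x,            ; # deopt args, followed by two deopt parameters
        i8 addrspace(1)* %obj)           ; gc parameters (not length prefixed)
    ret void
  }
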
| 476 | Semantics: |
| 477 | """""""""" |
| 478 | |
Philip Reames | dfc238b | 2015-01-02 19:46:49 +0000 | [diff] [blame] | 479 | A statepoint is assumed to read and write all memory. As a result, |
| 480 | memory operations cannot be reordered past a statepoint. It is |
| 481 | illegal to mark a statepoint as being either 'readonly' or 'readnone'. |
Philip Reames | f612322 | 2014-12-02 19:37:00 +0000 | [diff] [blame] | 482 | |
Philip Reames | dfc238b | 2015-01-02 19:46:49 +0000 | [diff] [blame] | 483 | Note that legal IR cannot perform any memory operation on a 'gc |
| 484 | pointer' argument of the statepoint in a location statically reachable |
| 485 | from the statepoint. Instead, the explicitly relocated value (from a |
Philip Reames | c609a59 | 2015-02-25 00:22:07 +0000 | [diff] [blame] | 486 | ``gc.relocate``) must be used. |
Philip Reames | f612322 | 2014-12-02 19:37:00 +0000 | [diff] [blame] | 487 | |
Philip Reames | c012728 | 2015-02-24 23:57:26 +0000 | [diff] [blame] | 488 | 'llvm.experimental.gc.result' Intrinsic |
| 489 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
Philip Reames | f612322 | 2014-12-02 19:37:00 +0000 | [diff] [blame] | 490 | |
| 491 | Syntax: |
| 492 | """"""" |
| 493 | |
| 494 | :: |
| 495 | |
| 496 | declare type* |
Chen Li | d71999e | 2015-12-26 07:54:32 +0000 | [diff] [blame] | 497 | @llvm.experimental.gc.result(token %statepoint_token) |
Philip Reames | f612322 | 2014-12-02 19:37:00 +0000 | [diff] [blame] | 498 | |
| 499 | Overview: |
| 500 | """"""""" |
| 501 | |
Philip Reames | c609a59 | 2015-02-25 00:22:07 +0000 | [diff] [blame] | 502 | ``gc.result`` extracts the result of the original call instruction |
| 503 | which was replaced by the ``gc.statepoint``. The ``gc.result`` |
Philip Reames | dfc238b | 2015-01-02 19:46:49 +0000 | [diff] [blame] | 504 | intrinsic is actually a family of three intrinsics due to an |
| 505 | implementation limitation. Other than the type of the return value, |
| 506 | the semantics are the same. |
Philip Reames | f612322 | 2014-12-02 19:37:00 +0000 | [diff] [blame] | 507 | |
| 508 | Operands: |
| 509 | """"""""" |
| 510 | |
Philip Reames | c609a59 | 2015-02-25 00:22:07 +0000 | [diff] [blame] | 511 | The first and only argument is the ``gc.statepoint`` which starts |
| 512 | the safepoint sequence of which this ``gc.result`` is a part. |
Chen Li | d71999e | 2015-12-26 07:54:32 +0000 | [diff] [blame] | 513 | Despite the typing of this as a generic token, *only* the value defined |
Philip Reames | c609a59 | 2015-02-25 00:22:07 +0000 | [diff] [blame] | 514 | by a ``gc.statepoint`` is legal here. |
Philip Reames | f612322 | 2014-12-02 19:37:00 +0000 | [diff] [blame] | 515 | |
| 516 | Semantics: |
| 517 | """""""""" |
| 518 | |
Philip Reames | c609a59 | 2015-02-25 00:22:07 +0000 | [diff] [blame] | 519 | The ``gc.result`` represents the return value of the call target of |
| 520 | the ``statepoint``. The type of the ``gc.result`` must exactly match |
Philip Reames | dfc238b | 2015-01-02 19:46:49 +0000 | [diff] [blame] | 521 | the return type of the target. If the call target returns void, there will |
Philip Reames | c609a59 | 2015-02-25 00:22:07 +0000 | [diff] [blame] | 522 | be no ``gc.result``. |
Philip Reames | f612322 | 2014-12-02 19:37:00 +0000 | [diff] [blame] | 523 | |
Philip Reames | c609a59 | 2015-02-25 00:22:07 +0000 | [diff] [blame] | 524 | A ``gc.result`` is modeled as a 'readnone' pure function. It has no |
Philip Reames | dfc238b | 2015-01-02 19:46:49 +0000 | [diff] [blame] | 525 | side effects since it is just a projection of the return value of the |
Philip Reames | c609a59 | 2015-02-25 00:22:07 +0000 | [diff] [blame] | 526 | previous call represented by the ``gc.statepoint``. |
Philip Reames | f612322 | 2014-12-02 19:37:00 +0000 | [diff] [blame] | 527 | |
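A minimal sketch of the pairing between a statepoint and its ``gc.result``,
assuming a hypothetical callee ``@return_i1`` and the typed-pointer intrinsic
name mangling used elsewhere in this document:

.. code-block:: llvm

  declare i1 @return_i1()
  declare token @llvm.experimental.gc.statepoint.p0f_i1f(i64, i32, i1 ()*, i32, i32, ...)
  declare i1 @llvm.experimental.gc.result.i1(token)

  define i1 @test_result() gc "statepoint-example" {
    ; The wrapped call returns i1, so the matching gc.result is also typed i1.
    %tok = call token (i64, i32, i1 ()*, i32, i32, ...) @llvm.experimental.gc.statepoint.p0f_i1f(i64 0, i32 0, i1 ()* @return_i1, i32 0, i32 0, i32 0, i32 0)
    ; Project the callee's return value out of the statepoint token.
    %res = call i1 @llvm.experimental.gc.result.i1(token %tok)
    ret i1 %res
  }
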
Philip Reames | c012728 | 2015-02-24 23:57:26 +0000 | [diff] [blame] | 528 | 'llvm.experimental.gc.relocate' Intrinsic |
| 529 | ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
Philip Reames | f612322 | 2014-12-02 19:37:00 +0000 | [diff] [blame] | 530 | |
| 531 | Syntax: |
| 532 | """"""" |
| 533 | |
| 534 | :: |
| 535 | |
Philip Reames | c012728 | 2015-02-24 23:57:26 +0000 | [diff] [blame] | 536 | declare <pointer type> |
Chen Li | d71999e | 2015-12-26 07:54:32 +0000 | [diff] [blame] | 537 | @llvm.experimental.gc.relocate(token %statepoint_token, |
Philip Reames | c012728 | 2015-02-24 23:57:26 +0000 | [diff] [blame] | 538 | i32 %base_offset, |
| 539 | i32 %pointer_offset) |
Philip Reames | f612322 | 2014-12-02 19:37:00 +0000 | [diff] [blame] | 540 | |
| 541 | Overview: |
| 542 | """"""""" |
| 543 | |
Philip Reames | c609a59 | 2015-02-25 00:22:07 +0000 | [diff] [blame] | 544 | A ``gc.relocate`` returns the potentially relocated value of a pointer |
Philip Reames | dfc238b | 2015-01-02 19:46:49 +0000 | [diff] [blame] | 545 | at the safepoint. |
Philip Reames | f612322 | 2014-12-02 19:37:00 +0000 | [diff] [blame] | 546 | |
| 547 | Operands: |
| 548 | """"""""" |
| 549 | |
Philip Reames | c609a59 | 2015-02-25 00:22:07 +0000 | [diff] [blame] | 550 | The first argument is the ``gc.statepoint`` which starts the |
| 551 | safepoint sequence of which this ``gc.relocate`` is a part. |
Chen Li | d71999e | 2015-12-26 07:54:32 +0000 | [diff] [blame] | 552 | Despite the typing of this as a generic token, *only* the value defined |
Philip Reames | c609a59 | 2015-02-25 00:22:07 +0000 | [diff] [blame] | 553 | by a ``gc.statepoint`` is legal here. |
Philip Reames | f612322 | 2014-12-02 19:37:00 +0000 | [diff] [blame] | 554 | |
Philip Reames | dfc238b | 2015-01-02 19:46:49 +0000 | [diff] [blame] | 555 | The second argument is an index into the statepoint's list of arguments |
Philip Reames | ca22b86 | 2015-08-26 23:13:35 +0000 | [diff] [blame] | 556 | which specifies the allocation for the pointer being relocated. |
Philip Reames | dfc238b | 2015-01-02 19:46:49 +0000 | [diff] [blame] | 557 | This index must land within the 'gc parameter' section of the |
Philip Reames | ca22b86 | 2015-08-26 23:13:35 +0000 | [diff] [blame] | 558 | statepoint's argument list. The associated value must be within the |
| 559 | object with which the pointer being relocated is associated. The optimizer |
| 560 | is free to change *which* interior derived pointer is reported, provided that |
| 561 | it does not replace an actual base pointer with another interior derived |
| 562 | pointer. Collectors are allowed to rely on the base pointer operand |
| 563 | remaining an actual base pointer if so constructed. |
Philip Reames | f612322 | 2014-12-02 19:37:00 +0000 | [diff] [blame] | 564 | |
Philip Reames | dfc238b | 2015-01-02 19:46:49 +0000 | [diff] [blame] | 565 | The third argument is an index into the statepoint's list of arguments |
| 566 | which specifies the (potentially) derived pointer being relocated. It |
| 567 | is legal for this index to be the same as the second argument |
| 568 | if-and-only-if a base pointer is being relocated. This index must land |
| 569 | within the 'gc parameter' section of the statepoint's argument list. |
Philip Reames | f612322 | 2014-12-02 19:37:00 +0000 | [diff] [blame] | 570 | |
| 571 | Semantics: |
| 572 | """""""""" |
Philip Reames | f612322 | 2014-12-02 19:37:00 +0000 | [diff] [blame] | 573 | |
Philip Reames | c609a59 | 2015-02-25 00:22:07 +0000 | [diff] [blame] | 574 | The return value of ``gc.relocate`` is the potentially relocated value |
Sanjoy Das | 25e71d8 | 2017-04-19 23:55:03 +0000 | [diff] [blame] | 575 | of the pointer specified by its arguments. It is unspecified how the |
Philip Reames | dfc238b | 2015-01-02 19:46:49 +0000 | [diff] [blame] | 576 | value of the returned pointer relates to the argument to the |
Philip Reames | c609a59 | 2015-02-25 00:22:07 +0000 | [diff] [blame] | 577 | ``gc.statepoint`` other than that a) it points to the same source |
Philip Reames | dfc238b | 2015-01-02 19:46:49 +0000 | [diff] [blame] | 578 | language object with the same offset, and b) the 'based-on' |
| 579 | relationship of the newly relocated pointers is a projection of the |
| 580 | unrelocated pointers. In particular, the integer value of the pointer |
| 581 | returned is unspecified. |
| 582 | |
Philip Reames | c609a59 | 2015-02-25 00:22:07 +0000 | [diff] [blame] | 583 | A ``gc.relocate`` is modeled as a ``readnone`` pure function. It has no |
Philip Reames | dfc238b | 2015-01-02 19:46:49 +0000 | [diff] [blame] | 584 | side effects since it is just a way to extract information about work |
Philip Reames | c609a59 | 2015-02-25 00:22:07 +0000 | [diff] [blame] | 585 | done during the actual call modeled by the ``gc.statepoint``. |
Philip Reames | f612322 | 2014-12-02 19:37:00 +0000 | [diff] [blame] | 586 | |
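The index arithmetic can be easier to follow on a concrete call. The sketch
below relocates both a base pointer and a pointer derived from it; the function
names are hypothetical and the typed-pointer syntax is assumed to match the
other examples in this document.

.. code-block:: llvm

  declare void @foo()
  declare token @llvm.experimental.gc.statepoint.p0f_isVoidf(i64, i32, void ()*, i32, i32, ...)
  declare i8 addrspace(1)* @llvm.experimental.gc.relocate.p1i8(token, i32, i32)

  define i8 @relocate_derived(i8 addrspace(1)* %base) gc "statepoint-example" {
    %derived = getelementptr i8, i8 addrspace(1)* %base, i64 8
    ; Operand indices: 0 ID, 1 num patch bytes, 2 target, 3 # call args,
    ; 4 flags, 5 # transition args, 6 # deopt args, 7 %base, 8 %derived.
    %tok = call token (i64, i32, void ()*, i32, i32, ...) @llvm.experimental.gc.statepoint.p0f_isVoidf(i64 0, i32 0, void ()* @foo, i32 0, i32 0, i32 0, i32 0, i8 addrspace(1)* %base, i8 addrspace(1)* %derived)
    ; Relocating a base pointer: both indices name the same operand.
    %base.relocated = call i8 addrspace(1)* @llvm.experimental.gc.relocate.p1i8(token %tok, i32 7, i32 7)
    ; Relocating a derived pointer: base index first, derived index second.
    %derived.relocated = call i8 addrspace(1)* @llvm.experimental.gc.relocate.p1i8(token %tok, i32 7, i32 8)
    ; Only the relocated values may be used for memory operations from here on.
    %v = load i8, i8 addrspace(1)* %derived.relocated
    ret i8 %v
  }
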
Philip Reames | e662550 | 2015-02-25 23:22:43 +0000 | [diff] [blame] | 587 | .. _statepoint-stackmap-format: |
Philip Reames | f612322 | 2014-12-02 19:37:00 +0000 | [diff] [blame] | 588 | |
Philip Reames | ce5ff37 | 2014-12-04 00:45:23 +0000 | [diff] [blame] | 589 | Stack Map Format |
Philip Reames | f612322 | 2014-12-02 19:37:00 +0000 | [diff] [blame] | 590 | ================ |
| 591 | |
Philip Reames | dfc238b | 2015-01-02 19:46:49 +0000 | [diff] [blame] | 592 | Locations for each pointer value which may need to be read and/or updated by |
| 593 | the runtime or collector are provided via the :ref:`Stack Map format |
| 594 | <stackmap-format>` specified in the PatchPoint documentation. |
Philip Reames | f612322 | 2014-12-02 19:37:00 +0000 | [diff] [blame] | 595 | |
| 596 | Each statepoint generates the following Locations: |
| 597 | |
Pat Gavlin | c7dc6d6ee | 2015-05-12 19:50:19 +0000 | [diff] [blame] | 598 | * Constant which describes the calling convention of the call target. This |
| 599 | constant is a valid :ref:`calling convention identifier <callingconv>` for |
| 600 | the version of LLVM used to generate the stackmap. No additional compatibility |
| 601 | guarantees are made for this constant over what LLVM provides elsewhere w.r.t. |
| 602 | these identifiers. |
| 603 | * Constant which describes the flags passed to the statepoint intrinsic |
Philip Reames | dfc238b | 2015-01-02 19:46:49 +0000 | [diff] [blame] | 604 | * Constant which describes number of following deopt *Locations* (not |
| 605 | operands) |
| 606 | * Variable number of Locations, one for each deopt parameter listed in |
Philip Reames | 95e363d | 2016-01-14 23:58:18 +0000 | [diff] [blame] | 607 | the IR statepoint (same number as described by previous Constant). At |
| 608 | the moment, only deopt parameters with a bitwidth of 64 bits or less |
| 609 | are supported. Values of a type larger than 64 bits can be specified |
| 610 | and reported only if a) the value is constant at the call site, and b) |
| 611 | the constant can be represented with less than 64 bits (assuming zero |
| 612 | extension to the original bitwidth). |
Philip Reames | 35bafee | 2016-01-15 00:13:39 +0000 | [diff] [blame] | 613 | * Variable number of relocation records, each of which consists of |
| 614 | exactly two Locations. Relocation records are described in detail |
| 615 | below. |
| 616 | |
| 617 | Each relocation record provides sufficient information for a collector to |
| 618 | relocate one or more derived pointers. Each record consists of a pair of |
| 619 | Locations. The second element in the record represents the pointer (or |
| 620 | pointers) which need to be updated. The first element in the record provides a |
| 621 | pointer to the base of the object with which the pointer(s) being relocated is |
| 622 | associated. This information is required for handling generalized derived |
| 623 | pointers since a pointer may be outside the bounds of the original allocation, |
| 624 | but still needs to be relocated with the allocation. Additionally: |
| 625 | |
| 626 | * The base pointer is guaranteed to also appear explicitly as a |
| 627 | relocation pair if it is used after the statepoint. |
| 628 | * There may be fewer relocation records than gc parameters in the IR |
Philip Reames | dfc238b | 2015-01-02 19:46:49 +0000 | [diff] [blame] | 629 | statepoint. Each *unique* pair will occur at least once; duplicates |
Philip Reames | 35bafee | 2016-01-15 00:13:39 +0000 | [diff] [blame] | 630 | are possible. |
| 631 | * The Locations within each record may either be of pointer size or a |
| 632 | multiple of pointer size. In the latter case, the record must be |
| 633 | interpreted as describing a sequence of pointers and their corresponding |
| 634 | base pointers. If the Location is of size N x sizeof(pointer), then |
| 635 | there will be N records of one pointer each contained within the Location. |
| 636 | Both Locations in a pair can be assumed to be of the same size. |
Philip Reames | f612322 | 2014-12-02 19:37:00 +0000 | [diff] [blame] | 637 | |
Philip Reames | dfc238b | 2015-01-02 19:46:49 +0000 | [diff] [blame] | 638 | Note that the Locations used in each section may describe the same |
| 639 | physical location; e.g., a stack slot may appear as a deopt location, |
| 640 | a gc base pointer, and a gc derived pointer. |
Philip Reames | f612322 | 2014-12-02 19:37:00 +0000 | [diff] [blame] | 641 | |
Philip Reames | dfc238b | 2015-01-02 19:46:49 +0000 | [diff] [blame] | 642 | The LiveOut section of the StkMapRecord will be empty for a statepoint |
| 643 | record. |
Philip Reames | f612322 | 2014-12-02 19:37:00 +0000 | [diff] [blame] | 644 | |
| 645 | Safepoint Semantics & Verification |
| 646 | ================================== |
| 647 | |
Philip Reames | dfc238b | 2015-01-02 19:46:49 +0000 | [diff] [blame] | 648 | The fundamental correctness property of the compiled code |
| 649 | w.r.t. the garbage collector is a dynamic one. It must be |
| 650 | the case that there is no dynamic trace such that an operation |
| 651 | involving a potentially relocated pointer is observably-after a |
| 652 | safepoint which could relocate it. 'observably-after' in this usage |
| 653 | means that an outside observer could observe this sequence of events |
| 654 | in a way which precludes the operation being performed before the |
| 655 | safepoint. |
Philip Reames | f612322 | 2014-12-02 19:37:00 +0000 | [diff] [blame] | 656 | |
Philip Reames | dfc238b | 2015-01-02 19:46:49 +0000 | [diff] [blame] | 657 | To understand why this 'observably-after' property is required, |
| 658 | consider a null comparison performed on the original copy of a |
| 659 | relocated pointer. Assuming that control flow follows the safepoint, |
| 660 | there is no way to observe externally whether the null comparison is |
| 661 | performed before or after the safepoint. (Remember, the original |
| 662 | Value is unmodified by the safepoint.) The compiler is free to make |
| 663 | either scheduling choice. |
Philip Reames | f612322 | 2014-12-02 19:37:00 +0000 | [diff] [blame] | 664 | |
Philip Reames | dfc238b | 2015-01-02 19:46:49 +0000 | [diff] [blame] | 665 | The actual correctness property implemented is slightly stronger than |
| 666 | this. We require that there be no *static path* on which a |
| 667 | potentially relocated pointer is 'observably-after' it may have been |
| 668 | relocated. This is slightly stronger than is strictly necessary (and |
| 669 | thus may disallow some otherwise valid programs), but greatly |
| 670 | simplifies reasoning about correctness of the compiled code. |
Philip Reames | f612322 | 2014-12-02 19:37:00 +0000 | [diff] [blame] | 671 | |
Philip Reames | dfc238b | 2015-01-02 19:46:49 +0000 | [diff] [blame] | 672 | By construction, this property will be upheld by the optimizer if |
| 673 | correctly established in the source IR. This is a key invariant of |
| 674 | the design. |
Philip Reames | f612322 | 2014-12-02 19:37:00 +0000 | [diff] [blame] | 675 | |
Philip Reames | dfc238b | 2015-01-02 19:46:49 +0000 | [diff] [blame] | 676 | The existing IR Verifier pass has been extended to check most of the |
| 677 | local restrictions on the intrinsics mentioned in their respective |
| 678 | documentation. The current implementation in LLVM does not check the |
| 679 | key relocation invariant, but there is ongoing work on developing such |
Tanya Lattner | 0d28f80 | 2015-08-05 03:51:17 +0000 | [diff] [blame] | 680 | a verifier. Please ask on llvm-dev if you're interested in |
Philip Reames | dfc238b | 2015-01-02 19:46:49 +0000 | [diff] [blame] | 681 | experimenting with the current version. |
Philip Reames | f612322 | 2014-12-02 19:37:00 +0000 | [diff] [blame] | 682 | |
Philip Reames | c88d732 | 2015-02-25 01:23:59 +0000 | [diff] [blame] | 683 | .. _statepoint-utilities: |
| 684 | |
| 685 | Utility Passes for Safepoint Insertion |
| 686 | ====================================== |
| 687 | |
| 688 | .. _RewriteStatepointsForGC: |
| 689 | |
| 690 | RewriteStatepointsForGC |
| 691 | ^^^^^^^^^^^^^^^^^^^^^^^^ |
| 692 | |
Philip Reames | 0d98ada | 2017-04-19 23:16:13 +0000 | [diff] [blame] | 693 | The pass RewriteStatepointsForGC transforms a function's IR to lower from the |
| 694 | abstract machine model described above to the explicit statepoint model of |
| 695 | relocations. To do this, it replaces all calls or invokes of functions which |
| 696 | might contain a safepoint poll with a ``gc.statepoint`` and associated full |
| 697 | relocation sequence, including all required ``gc.relocate`` calls. |
| 698 | |
| 699 | Note that by default, this pass only runs for the "statepoint-example" or |
| 700 | "core-clr" gc strategies. You will need to add your custom strategy to this |
| 701 | whitelist or use one of the predefined ones. |
Philip Reames | c88d732 | 2015-02-25 01:23:59 +0000 | [diff] [blame] | 702 | |
| 703 | As an example, given this code: |
| 704 | |
Nuno Lopes | e02fcee | 2017-07-26 14:11:23 +0000 | [diff] [blame] | 705 | .. code-block:: llvm |
Philip Reames | c88d732 | 2015-02-25 01:23:59 +0000 | [diff] [blame] | 706 | |
| 707 | define i8 addrspace(1)* @test1(i8 addrspace(1)* %obj) |
| 708 | gc "statepoint-example" { |
Philip Reames | 0d98ada | 2017-04-19 23:16:13 +0000 | [diff] [blame] | 709 | call void @foo() |
Philip Reames | c88d732 | 2015-02-25 01:23:59 +0000 | [diff] [blame] | 710 | ret i8 addrspace(1)* %obj |
| 711 | } |
| 712 | |
| 713 | The pass would produce this IR: |
| 714 | |
Nuno Lopes | e02fcee | 2017-07-26 14:11:23 +0000 | [diff] [blame] | 715 | .. code-block:: llvm |
Philip Reames | c88d732 | 2015-02-25 01:23:59 +0000 | [diff] [blame] | 716 | |
| 717 | define i8 addrspace(1)* @test1(i8 addrspace(1)* %obj) |
| 718 | gc "statepoint-example" { |
Chen Li | d71999e | 2015-12-26 07:54:32 +0000 | [diff] [blame] | 719 | %0 = call token (i64, i32, void ()*, i32, i32, ...) @llvm.experimental.gc.statepoint.p0f_isVoidf(i64 2882400000, i32 0, void ()* @foo, i32 0, i32 0, i32 0, i32 5, i32 0, i32 -1, i32 0, i32 0, i32 0, i8 addrspace(1)* %obj) |
| 720 | %obj.relocated = call coldcc i8 addrspace(1)* @llvm.experimental.gc.relocate.p1i8(token %0, i32 12, i32 12) |
Philip Reames | c88d732 | 2015-02-25 01:23:59 +0000 | [diff] [blame] | 721 | ret i8 addrspace(1)* %obj.relocated |
| 722 | } |
| 723 | |
| 724 | In the above examples, the addrspace(1) marker on the pointers is the mechanism |
| 725 | that the ``statepoint-example`` GC strategy uses to distinguish references from |
Philip Reames | 0d98ada | 2017-04-19 23:16:13 +0000 | [diff] [blame] | 726 | non references. The pass assumes that all addrspace(1) pointers are non-integral |
| 727 | pointer types. Address space 1 is not globally reserved for this purpose. |
Philip Reames | c88d732 | 2015-02-25 01:23:59 +0000 | [diff] [blame] | 728 | |
| 729 | This pass can be used as a utility by a language frontend that doesn't |
| 730 | want to manually reason about liveness, base pointers, or relocation when |
| 731 | constructing IR. As currently implemented, RewriteStatepointsForGC must be |
Philip Reames | ca22b86 | 2015-08-26 23:13:35 +0000 | [diff] [blame] | 732 | run after SSA construction (i.e. mem2reg). |
Philip Reames | c88d732 | 2015-02-25 01:23:59 +0000 | [diff] [blame] | 733 | |
Philip Reames | ca22b86 | 2015-08-26 23:13:35 +0000 | [diff] [blame] | 734 | RewriteStatepointsForGC will ensure that appropriate base pointers are listed |
| 735 | for every relocation created. It will do so by duplicating code as needed to |
| 736 | propagate the base pointer associated with each pointer being relocated to |
| 737 | the appropriate safepoints. The implementation assumes that the following |
| 738 | IR constructs produce base pointers: loads from the heap, addresses of global |
| 739 | variables, function arguments, function return values. Constant pointers (such |
| 740 | as null) are also assumed to be base pointers. In practice, this constraint |
| 741 | can be relaxed to producing interior derived pointers provided the target |
| 742 | collector can find the associated allocation from an arbitrary interior |
| 743 | derived pointer. |
Philip Reames | c88d732 | 2015-02-25 01:23:59 +0000 | [diff] [blame] | 744 | |
Philip Reames | 0d98ada | 2017-04-19 23:16:13 +0000 | [diff] [blame] | 745 | By default RewriteStatepointsForGC passes in ``0xABCDEF00`` as the statepoint |
| 746 | ID and ``0`` as the number of patchable bytes to the newly constructed |
| 747 | ``gc.statepoint``. These values can be configured on a per-callsite |
| 748 | basis using the attributes ``"statepoint-id"`` and |
| 749 | ``"statepoint-num-patch-bytes"``. If a call site is marked with a |
| 750 | ``"statepoint-id"`` function attribute and its value is a positive |
| 751 | integer (represented as a string), then that value is used as the ID |
| 752 | of the newly constructed ``gc.statepoint``. If a call site is marked |
| 753 | with a ``"statepoint-num-patch-bytes"`` function attribute and its |
| 754 | value is a positive integer, then that value is used as the 'num patch |
| 755 | bytes' parameter of the newly constructed ``gc.statepoint``. The |
| 756 | ``"statepoint-id"`` and ``"statepoint-num-patch-bytes"`` attributes |
| 757 | are not propagated to the ``gc.statepoint`` call or invoke if they |
| 758 | could be successfully parsed. |
| 759 | |
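As a sketch of the call-site attributes described above, the input below tags
one call through an attribute group. The values ``42`` and ``16`` and the
function name ``@test_attrs`` are arbitrary choices made for illustration.

.. code-block:: llvm

  declare void @foo()

  define i8 addrspace(1)* @test_attrs(i8 addrspace(1)* %obj) gc "statepoint-example" {
    ; Per the rules above, the rewritten gc.statepoint for this call would be
    ; expected to begin with "i64 42, i32 16" instead of the default ID and
    ; patch-byte count.
    call void @foo() #0
    ret i8 addrspace(1)* %obj
  }

  attributes #0 = { "statepoint-id"="42" "statepoint-num-patch-bytes"="16" }
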
| 760 | In practice, RewriteStatepointsForGC should be run much later in the pass |
Philip Reames | c88d732 | 2015-02-25 01:23:59 +0000 | [diff] [blame] | 761 | pipeline, after most optimization is already done. This helps to improve |
| 762 | the quality of the generated code when compiled with garbage collection support. |
Philip Reames | c88d732 | 2015-02-25 01:23:59 +0000 | [diff] [blame] | 763 | |
| 764 | .. _PlaceSafepoints: |
| 765 | |
| 766 | PlaceSafepoints |
| 767 | ^^^^^^^^^^^^^^^^ |
| 768 | |
Philip Reames | 0d98ada | 2017-04-19 23:16:13 +0000 | [diff] [blame] | 769 | The pass PlaceSafepoints inserts safepoint polls sufficient to ensure running |
| 770 | code checks for a safepoint request in a timely manner. This pass is expected |
| 771 | to be run before RewriteStatepointsForGC and thus does not produce full |
| 772 | relocation sequences. |
Philip Reames | c88d732 | 2015-02-25 01:23:59 +0000 | [diff] [blame] | 773 | |
Philip Reames | 5017ab5 | 2015-02-26 01:18:21 +0000 | [diff] [blame] | 774 | As an example, given input IR of the following: |
| 775 | |
| 776 | .. code-block:: llvm |
| 777 | |
| 778 | define void @test() gc "statepoint-example" { |
| 779 | call void @foo() |
| 780 | ret void |
| 781 | } |
| 782 | |
| 783 | declare void @do_safepoint() |
| 784 | define void @gc.safepoint_poll() { |
| 785 | call void @do_safepoint() |
| 786 | ret void |
| 787 | } |
| 788 | |
| 789 | |
| 790 | This pass would produce the following IR: |
| 791 | |
Nuno Lopes | e02fcee | 2017-07-26 14:11:23 +0000 | [diff] [blame] | 792 | .. code-block:: llvm |
Philip Reames | 5017ab5 | 2015-02-26 01:18:21 +0000 | [diff] [blame] | 793 | |
| 794 | define void @test() gc "statepoint-example" { |
Philip Reames | 0d98ada | 2017-04-19 23:16:13 +0000 | [diff] [blame] | 795 | call void @do_safepoint() |
| 796 | call void @foo() |
Philip Reames | 5017ab5 | 2015-02-26 01:18:21 +0000 | [diff] [blame] | 797 | ret void |
| 798 | } |
| 799 | |
Philip Reames | 0d98ada | 2017-04-19 23:16:13 +0000 | [diff] [blame] | 800 | In this case, we've added an (unconditional) entry safepoint poll. Note that |
| 801 | despite appearances, the entry poll is not necessarily redundant. We'd have to |
| 802 | know that ``foo`` and ``test`` were not mutually recursive for the poll to be |
| 803 | redundant. In practice, you'd probably want your poll definition to contain |
| 804 | a conditional branch of some form. |
Philip Reames | 5017ab5 | 2015-02-26 01:18:21 +0000 | [diff] [blame] | 805 | |
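One possible shape for such a conditional poll is sketched below. The global
``@need_safepoint`` is a hypothetical flag assumed to be set by the runtime
when it wants threads to stop; a real runtime may use a different signalling
mechanism (for example a thread-local flag or a page-protection trick).

.. code-block:: llvm

  ; Hypothetical flag written by the runtime to request a safepoint.
  @need_safepoint = external global i1

  declare void @do_safepoint()

  define void @gc.safepoint_poll() {
  entry:
    ; Fast path: a single load and branch when no safepoint is requested.
    %flag = load volatile i1, i1* @need_safepoint
    br i1 %flag, label %slow, label %done

  slow:
    ; Slow path: call into the runtime, which may block and run a collection.
    call void @do_safepoint()
    br label %done

  done:
    ret void
  }
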
Philip Reames | c88d732 | 2015-02-25 01:23:59 +0000 | [diff] [blame] | 806 | At the moment, PlaceSafepoints can insert safepoint polls at method entry and |
| 807 | loop backedge locations. Extending this to work with return polls would be |
| 808 | straightforward if desired. |
| 809 | |
| 810 | PlaceSafepoints includes a number of optimizations to avoid placing safepoint |
| 811 | polls at particular sites unless needed to ensure timely execution of a poll |
| 812 | under normal conditions. PlaceSafepoints does not attempt to ensure timely |
| 813 | execution of a poll under worst case conditions such as heavy system paging. |
| 814 | |
| 815 | The implementation of a safepoint poll action is specified by looking up a |
| 816 | function of the name ``gc.safepoint_poll`` in the containing Module. The body |
| 817 | of this function is inserted at each poll site desired. While calls or invokes |
| 818 | inside this method are transformed to ``gc.statepoint`` calls, recursive poll |
| 819 | insertion is not performed. |
| 820 | |
Philip Reames | 0d98ada | 2017-04-19 23:16:13 +0000 | [diff] [blame] | 821 | This pass is useful for any language frontend which only has to support |
| 822 | garbage collection semantics at safepoints. If you need other abstract |
| 823 | frame information at safepoints (e.g. for deoptimization or introspection), |
| 824 | you can insert safepoint polls in the frontend. If you have the latter case, |
| 825 | please ask on llvm-dev for suggestions. There's been a good amount of work |
| 826 | done on making such a scheme work well in practice which is not yet documented |
| 827 | here. |
Philip Reames | c88d732 | 2015-02-25 01:23:59 +0000 | [diff] [blame] | 828 | |
| 829 | |
Philip Reames | b773631 | 2015-07-16 21:10:46 +0000 | [diff] [blame] | 830 | Supported Architectures |
| 831 | ======================= |
| 832 | |
| 833 | Support for statepoint generation requires some code for each backend. |
| 834 | Today, only X86_64 is supported. |
| 835 | |
Philip Reames | 2e7383c | 2016-03-03 23:24:44 +0000 | [diff] [blame] | 836 | Problem Areas and Active Work |
| 837 | ============================= |
| 838 | |
Sanjoy Das | fefc4d5 | 2016-03-04 18:14:09 +0000 | [diff] [blame] | 839 | #. Support for languages which allow unmanaged pointers to garbage collected |
Philip Reames | 2e7383c | 2016-03-03 23:24:44 +0000 | [diff] [blame] | 840 | objects (i.e. pass a pointer to an object to a C routine) via pinning. |
| 841 | |
| 842 | #. Support for garbage collected objects allocated on the stack. Specifically, |
Sanjoy Das | fefc4d5 | 2016-03-04 18:14:09 +0000 | [diff] [blame] | 843 | allocas are always assumed to be in address space 0 and we need a |
| 844 | cast/promotion operator to let rewriting identify them. |
Philip Reames | 2e7383c | 2016-03-03 23:24:44 +0000 | [diff] [blame] | 845 | |
Sanjoy Das | fefc4d5 | 2016-03-04 18:14:09 +0000 | [diff] [blame] | 846 | #. The current statepoint lowering is known to be somewhat poor. In the very |
| 847 | long term, we'd like to integrate statepoints with the register allocator; |
| 848 | in the near term this is unlikely to happen. We've found the quality of |
| 849 | lowering to be relatively unimportant as hot-statepoints are almost always |
| 850 | inliner bugs. |
Philip Reames | 2e7383c | 2016-03-03 23:24:44 +0000 | [diff] [blame] | 851 | |
Sanjoy Das | fefc4d5 | 2016-03-04 18:14:09 +0000 | [diff] [blame] | 852 | #. Concerns have been raised that the statepoint representation results in a |
| 853 | large amount of IR being produced for some examples and that this |
Philip Reames | 2e7383c | 2016-03-03 23:24:44 +0000 | [diff] [blame] | 854 | contributes to higher than expected memory usage and compile times. There are |
Sanjoy Das | fefc4d5 | 2016-03-04 18:14:09 +0000 | [diff] [blame] | 855 | no immediate plans to make changes due to this, but alternate models may be |
Philip Reames | 2e7383c | 2016-03-03 23:24:44 +0000 | [diff] [blame] | 856 | explored in the future. |
| 857 | |
Sanjoy Das | fefc4d5 | 2016-03-04 18:14:09 +0000 | [diff] [blame] | 858 | #. Relocations along exceptional paths are currently broken in ToT. In |
| 859 | particular, there is currently no way to represent a rethrow on a path which |
| 860 | also has relocations. See `this llvm-dev discussion |
| 861 | <https://groups.google.com/forum/#!topic/llvm-dev/AE417XjgxvI>`_ for more |
| 862 | detail. |
Philip Reames | 2e7383c | 2016-03-03 23:24:44 +0000 | [diff] [blame] | 863 | |
Philip Reames | 8333152 | 2014-12-04 18:33:28 +0000 | [diff] [blame] | 864 | Bugs and Enhancements |
| 865 | ===================== |
Philip Reames | dfc238b | 2015-01-02 19:46:49 +0000 | [diff] [blame] | 866 | |
| 867 | Currently known bugs and enhancements under consideration can be |
| 868 | tracked by performing a `bugzilla search |
Ismail Donmez | c7ff814 | 2017-02-17 08:26:11 +0000 | [diff] [blame] | 869 | <https://bugs.llvm.org/buglist.cgi?cmdtype=runnamed&namedcmd=Statepoint%20Bugs&list_id=64342>`_ |
Philip Reames | dfc238b | 2015-01-02 19:46:49 +0000 | [diff] [blame] | 870 | for [Statepoint] in the summary field. When filing new bugs, please |
| 871 | use this tag so that interested parties see the newly filed bug. As |
Tanya Lattner | 0d28f80 | 2015-08-05 03:51:17 +0000 | [diff] [blame] | 872 | with most LLVM features, design discussions take place on `llvm-dev |
| 873 | <http://lists.llvm.org/mailman/listinfo/llvm-dev>`_, and patches |
Philip Reames | dfc238b | 2015-01-02 19:46:49 +0000 | [diff] [blame] | 874 | should be sent to `llvm-commits |
Tanya Lattner | 0d28f80 | 2015-08-05 03:51:17 +0000 | [diff] [blame] | 875 | <http://lists.llvm.org/mailman/listinfo/llvm-commits>`_ for review. |
Philip Reames | 8333152 | 2014-12-04 18:33:28 +0000 | [diff] [blame] | 876 | |