===================================
Stack maps and patch points in LLVM
===================================

.. contents::
   :local:
   :depth: 2

Definitions
===========

In this document we refer to the "runtime" collectively as all
components that serve as the LLVM client, including the LLVM IR
generator, object code consumer, and code patcher.

A stack map records the location of ``live values`` at a particular
instruction address. These ``live values`` do not refer to all the
LLVM values live across the stack map. Instead, they are only the
values that the runtime requires to be live at this point. For
example, they may be the values the runtime will need to resume
program execution at that point independent of the compiled function
containing the stack map.

LLVM emits stack map data into the object code within a designated
:ref:`stackmap-section`. This stack map data contains a record for
each stack map. The record stores the stack map's instruction address
and contains an entry for each mapped value. Each entry encodes a
value's location as a register, stack offset, or constant.

A patch point is an instruction address at which space is reserved for
patching a new instruction sequence at run time. Patch points look
much like calls to LLVM. They take arguments that follow a calling
convention and may return a value. They also imply stack map
generation, which allows the runtime to locate the patch point and
find the location of ``live values`` at that point.

Motivation
==========

This functionality is currently experimental but is potentially useful
in a variety of settings, the most obvious being a runtime (JIT)
compiler. Example applications of the patchpoint intrinsics are
implementing an inline call cache for polymorphic method dispatch or
optimizing the retrieval of properties in dynamically typed languages
such as JavaScript.

The intrinsics documented here are currently used by the JavaScript
compiler within the open source WebKit project (see the `FTL JIT
<https://trac.webkit.org/wiki/FTLJIT>`_), but they are designed to be
used whenever stack maps or code patching are needed. Because the
intrinsics have experimental status, compatibility across LLVM
releases is not guaranteed.

The stack map functionality described in this document is separate
from the functionality described in
:ref:`stack-map`. `GCFunctionMetadata` provides the location of
pointers into a collected heap captured by the `GCRoot` intrinsic,
which can also be considered a "stack map". Unlike the stack maps
defined above, the `GCFunctionMetadata` stack map interface does not
provide a way to associate live register values of arbitrary type with
an instruction address, nor does it specify a format for the resulting
stack map. The stack maps described here could potentially provide
richer information to a garbage collecting runtime, but that usage
will not be discussed in this document.

Intrinsics
==========

The following two kinds of intrinsics can be used to implement stack
maps and patch points: ``llvm.experimental.stackmap`` and
``llvm.experimental.patchpoint``. Both kinds of intrinsics generate a
stack map record, and they both allow some form of code patching. They
can be used independently (i.e., ``llvm.experimental.patchpoint``
implicitly generates a stack map without the need for an additional
call to ``llvm.experimental.stackmap``). The choice of which to use
depends on whether it is necessary to reserve space for code patching
and whether any of the intrinsic arguments should be lowered according
to calling conventions. ``llvm.experimental.stackmap`` does not
reserve any space, nor does it expect any call arguments. If the
runtime patches code at the stack map's address, it will destructively
overwrite the program text. This is unlike
``llvm.experimental.patchpoint``, which reserves space for in-place
patching without overwriting surrounding code. The
``llvm.experimental.patchpoint`` intrinsic also lowers a specified
number of arguments according to its calling convention. This allows
patched code to make in-place function calls without marshaling.

Each instance of one of these intrinsics generates a stack map record
in the :ref:`stackmap-section`. The record includes an ID, allowing
the runtime to uniquely identify the stack map, and the offset within
the code from the beginning of the enclosing function.

'``llvm.experimental.stackmap``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

  declare void
    @llvm.experimental.stackmap(i64 <id>, i32 <numShadowBytes>, ...)

Overview:
"""""""""

The '``llvm.experimental.stackmap``' intrinsic records the location of
specified values in the stack map without generating any code.

Operands:
"""""""""

The first operand is an ID to be encoded within the stack map. The
second operand is the number of shadow bytes following the
intrinsic. The variable number of operands that follow are the ``live
values`` for which locations will be recorded in the stack map.

To use this intrinsic as a bare-bones stack map, with no code patching
support, the number of shadow bytes can be set to zero.

Semantics:
""""""""""

The stack map intrinsic generates no code in place, unless nops are
needed to cover its shadow (see below). However, its offset from
function entry is stored in the stack map. This is the relative
instruction address immediately following the instructions that
precede the stack map.

The stack map ID allows a runtime to locate the desired stack map
record. LLVM passes this ID through directly to the stack map
record without checking uniqueness.

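Because LLVM does not check IDs for uniqueness, a runtime that relies on
unique IDs may want to check this itself. The following is a minimal
Python sketch of such a lookup table, not part of LLVM. The function names
and the ``(id, offset)`` pair representation are hypothetical and chosen
only for illustration.

```python
from collections import defaultdict


def index_records_by_id(records):
    """Group (patchpoint_id, instruction_offset) pairs by their ID."""
    by_id = defaultdict(list)
    for patchpoint_id, instruction_offset in records:
        by_id[patchpoint_id].append(instruction_offset)
    return dict(by_id)


def duplicate_ids(by_id):
    """Return, in sorted order, the IDs that map to more than one record."""
    return sorted(i for i, offsets in by_id.items() if len(offsets) > 1)


# Hypothetical records: ID 77 was used for two different stack maps.
records = [(77, 0x05), (78, 0x20), (77, 0x41)]
by_id = index_records_by_id(records)
```

Whether a duplicate ID is an error is a policy decision for the runtime;
the sketch only reports it.
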
LLVM guarantees a shadow of instructions following the stack map's
instruction offset during which neither the end of the basic block nor
another call to ``llvm.experimental.stackmap`` or
``llvm.experimental.patchpoint`` may occur. This allows the runtime to
patch the code at this point in response to an event triggered from
outside the code. The code for instructions following the stack map
may be emitted in the stack map's shadow, and these instructions may
be overwritten by destructive patching. Without shadow bytes, this
destructive patching could overwrite program text or data outside the
current function. We disallow overlapping stack map shadows so that
the runtime does not need to consider this corner case.

For example, a stack map with an 8-byte shadow:

.. code-block:: llvm

  call void @runtime()
  call void (i64, i32, ...)* @llvm.experimental.stackmap(i64 77, i32 8,
                                                         i64* %ptr)
  %val = load i64* %ptr
  %add = add i64 %val, 3
  ret i64 %add

May require one byte of nop-padding:

.. code-block:: none

  0x00 callq _runtime
  0x05 nop                <--- stack map address
  0x06 movq (%rdi), %rax
  0x07 addq $3, %rax
  0x0a popq %rdx
  0x0b ret                <--- end of 8-byte shadow

Now, if the runtime needs to invalidate the compiled code, it may
patch 8 bytes of code at the stack map's address as follows:

.. code-block:: none

  0x00 callq _runtime
  0x05 movl  $0xffff, %rax <--- patched code at stack map address
  0x0a callq *%rax         <--- end of 8-byte shadow

This way, after the normal call to the runtime returns, the code will
execute a patched call to a special entry point that can rebuild a
stack frame from the values located by the stack map.

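The destructive patch described above can be sketched in Python as a
bounds-checked overwrite of a code buffer. This is an illustration under
stated assumptions rather than a real patcher: ``patch_stackmap`` is a
hypothetical helper, the code buffer is a plain ``bytearray`` standing in
for writable program text, and the patch bytes assume the usual x86-64
encodings of ``movl $0xffff, %eax`` (``b8 ff ff 00 00``) and
``callq *%rax`` (``ff d0``).

```python
def patch_stackmap(code: bytearray, offset: int, patch: bytes,
                   shadow_bytes: int) -> None:
    """Overwrite `code` at `offset`, refusing to write past the shadow."""
    if len(patch) > shadow_bytes:
        raise ValueError("patch does not fit in the stack map's shadow")
    if offset + len(patch) > len(code):
        raise ValueError("patch runs past the end of the code buffer")
    code[offset:offset + len(patch)] = patch


# 16 bytes of stand-in program text; the stack map address is offset 0x05.
code = bytearray(range(16))
original = bytes(code)

# movl $0xffff, %eax ; callq *%rax  (7 bytes, within the 8-byte shadow)
patch = bytes([0xB8, 0xFF, 0xFF, 0x00, 0x00, 0xFF, 0xD0])
patch_stackmap(code, 0x05, patch, shadow_bytes=8)
```

A real runtime would also need to make the page writable and flush
instruction caches where the architecture requires it; neither is shown.
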
'``llvm.experimental.patchpoint.*``' Intrinsic
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Syntax:
"""""""

::

  declare void
    @llvm.experimental.patchpoint.void(i64 <id>, i32 <numBytes>,
                                       i8* <target>, i32 <numArgs>, ...)
  declare i64
    @llvm.experimental.patchpoint.i64(i64 <id>, i32 <numBytes>,
                                      i8* <target>, i32 <numArgs>, ...)

Overview:
"""""""""

The '``llvm.experimental.patchpoint.*``' intrinsics create a function
call to the specified ``<target>`` and record the location of specified
values in the stack map.

Operands:
"""""""""

The first operand is an ID, the second operand is the number of bytes
reserved for the patchable region, the third operand is the target
address of a function (optionally null), and the fourth operand
specifies how many of the following variable operands are considered
function call arguments. The remaining variable number of operands are
the ``live values`` for which locations will be recorded in the stack
map.

Semantics:
""""""""""

The patch point intrinsic generates a stack map. It also emits a
function call to the address specified by ``<target>`` if the address
is not a constant null. The function call and its arguments are
lowered according to the calling convention specified at the
intrinsic's callsite. Variants of the intrinsic with non-void return
type also return a value according to calling convention.

On PowerPC, note that ``<target>`` must be the ABI function pointer for the
intended target of the indirect call. Specifically, when compiling for the
ELF V1 ABI, ``<target>`` is the function-descriptor address normally used as
the C/C++ function-pointer representation.

Requesting zero patch point arguments is valid. In this case, all
variable operands are handled just like
``llvm.experimental.stackmap``. The difference is that space will
still be reserved for patching, a call will be emitted, and a return
value is allowed.

The locations of the arguments are not normally recorded in the stack
map because they are already fixed by the calling convention. The
remaining ``live values`` will have their location recorded, which
could be a register, stack location, or constant. A special calling
convention has been introduced for use with stack maps, ``anyregcc``,
which forces the arguments to be loaded into registers but allows
those registers to be dynamically allocated. These argument registers
will have their register locations recorded in the stack map in
addition to the remaining ``live values``.

The patch point also emits nops to cover at least ``<numBytes>`` of
instruction encoding space. Hence, the client must ensure that
``<numBytes>`` is enough to encode a call to the target address on the
supported targets. If the call target is constant null, then there is
no minimum requirement. A zero-byte null target patchpoint is
valid.

The runtime may patch the code emitted for the patch point, including
the call sequence and nops. However, the runtime may not assume
anything about the code LLVM emits within the reserved space. Partial
patching is not allowed. The runtime must patch all reserved bytes,
padding with nops if necessary.

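The rule that every reserved byte must be patched can be sketched as
follows. This is a minimal Python illustration, not LLVM code. It assumes
an x86-64 target, the single-byte ``0x90`` nop, and the common 13-byte
encoding of ``movabsq $imm64, %r11`` followed by ``callq *%r11``. The
helper names ``encode_call_via_r11`` and ``fill_patchpoint`` are
hypothetical.

```python
import struct

NOP = b"\x90"  # single-byte x86 nop (assumed target: x86-64)


def encode_call_via_r11(target: int) -> bytes:
    """Encode `movabsq $target, %r11 ; callq *%r11` (10 + 3 bytes)."""
    movabs = b"\x49\xbb" + struct.pack("<Q", target)
    call = b"\x41\xff\xd3"
    return movabs + call


def fill_patchpoint(sequence: bytes, num_bytes: int) -> bytes:
    """Pad `sequence` with nops so that it covers all reserved bytes."""
    if len(sequence) > num_bytes:
        raise ValueError("sequence is longer than the reserved space")
    return sequence + NOP * (num_bytes - len(sequence))


# A 15-byte patch point: 13 bytes of call sequence plus 2 nops.
patched = fill_patchpoint(encode_call_via_r11(0xFFFF000000000002), 15)
```

Under these assumptions, a ``<numBytes>`` below 13 could not hold this
particular call sequence, which is why the client has to size the reserved
region for its target.
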
This example shows a patch point reserving 15 bytes, with one argument
in ``%rdi``, and a return value in ``%rax`` per native calling convention:

.. code-block:: llvm

  %target = inttoptr i64 -281474976710654 to i8*
  %val = call i64 (i64, i32, ...)*
           @llvm.experimental.patchpoint.i64(i64 78, i32 15,
                                             i8* %target, i32 1, i64* %ptr)
  %add = add i64 %val, 3
  ret i64 %add

May generate:

.. code-block:: none

  0x00 movabsq $0xffff000000000002, %r11 <--- patch point address
  0x0a callq   *%r11
  0x0d nop
  0x0e nop                               <--- end of reserved 15-bytes
  0x0f addq    $0x3, %rax
  0x10 movl    %rax, 8(%rsp)

Note that no stack map locations will be recorded. If the patched code
sequence does not need arguments fixed to specific calling convention
registers, then the ``anyregcc`` convention may be used:

.. code-block:: none

  %val = call anyregcc @llvm.experimental.patchpoint(i64 78, i32 15,
                                                     i8* %target, i32 1,
                                                     i64* %ptr)

The stack map now indicates the location of the ``%ptr`` argument and
return value:

.. code-block:: none

  Stack Map: ID=78, Loc0=%r9 Loc1=%r8

The patch code sequence may now use the argument that happened to be
allocated in ``%r8`` and return a value allocated in ``%r9``:

.. code-block:: none

  0x00 movslq 4(%r8), %r9             <--- patched code at patch point address
  0x03 nop
  ...
  0x0e nop                            <--- end of reserved 15-bytes
  0x0f addq    $0x3, %r9
  0x10 movl    %r9, 8(%rsp)

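A runtime consuming an ``anyregcc`` record has to separate the return
value, the call arguments, and the remaining live values. The Python
sketch below assumes the ordering suggested by the example above (return
value first, then the arguments, then the live values). That ordering is
inferred from the example, and ``split_anyreg_locations`` is a hypothetical
helper.

```python
def split_anyreg_locations(locations, has_result, num_args):
    """Split a record's location list into (result, args, live values)."""
    needed = (1 if has_result else 0) + num_args
    if len(locations) < needed:
        raise ValueError("record has fewer locations than expected")
    pos = 0
    result = None
    if has_result:
        result = locations[0]
        pos = 1
    args = locations[pos:pos + num_args]
    live_values = locations[pos + num_args:]
    return result, args, live_values


# The record from the example: Loc0=%r9 (return value), Loc1=%r8 (%ptr).
result, args, live_values = split_anyreg_locations(
    ["%r9", "%r8"], has_result=True, num_args=1)
```
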
.. _stackmap-format:

Stack Map Format
================

The existence of a stack map or patch point intrinsic within an LLVM
Module forces code emission to create a :ref:`stackmap-section`. The
format of this section follows:

.. code-block:: none

  Header {
    uint8  : Stack Map Version (current version is 2)
    uint8  : Reserved (expected to be 0)
    uint16 : Reserved (expected to be 0)
  }
  uint32 : NumFunctions
  uint32 : NumConstants
  uint32 : NumRecords
  StkSizeRecord[NumFunctions] {
    uint64 : Function Address
    uint64 : Stack Size
    uint64 : Record Count
  }
  Constants[NumConstants] {
    uint64 : LargeConstant
  }
  StkMapRecord[NumRecords] {
    uint64 : PatchPoint ID
    uint32 : Instruction Offset
    uint16 : Reserved (record flags)
    uint16 : NumLocations
    Location[NumLocations] {
      uint8  : Register | Direct | Indirect | Constant | ConstantIndex
      uint8  : Reserved (location flags)
      uint16 : Dwarf RegNum
      int32  : Offset or SmallConstant
    }
    uint16 : Padding
    uint16 : NumLiveOuts
    LiveOuts[NumLiveOuts] {
      uint16 : Dwarf RegNum
      uint8  : Reserved
      uint8  : Size in Bytes
    }
    uint32 : Padding (only if required to align to 8 byte)
  }

The first byte of each location encodes a type that indicates how to
interpret the ``RegNum`` and ``Offset`` fields as follows:

======== ============= =================== ===========================
Encoding Type          Value               Description
-------- ------------- ------------------- ---------------------------
0x1      Register      Reg                 Value in a register
0x2      Direct        Reg + Offset        Frame index value
0x3      Indirect      [Reg + Offset]      Spilled value
0x4      Constant      Offset              Small constant
0x5      ConstantIndex Constants[Offset]   Large constant
======== ============= =================== ===========================

In the common case, a value is available in a register, and the
``Offset`` field will be zero. Values spilled to the stack are encoded
as ``Indirect`` locations. The runtime must load those values from a
stack address, typically in the form ``[BP + Offset]``. If an
``alloca`` value is passed directly to a stack map intrinsic, then
LLVM may fold the frame index into the stack map as an optimization to
avoid allocating a register or stack slot. These frame indices will be
encoded as ``Direct`` locations in the form ``BP + Offset``. LLVM may
also optimize constants by emitting them directly in the stack map,
either in the ``Offset`` of a ``Constant`` location or in the constant
pool, referred to by ``ConstantIndex`` locations.

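The interpretation of the five location types can be sketched in Python.
This is an illustration only. ``resolve_location`` is a hypothetical
helper, registers are modeled as a dictionary keyed by DWARF register
number, and memory is modeled as a dictionary from address to value
instead of a real address space. The DWARF numbers in the usage lines
assume the x86-64 mapping, where 0 is ``%rax`` and 6 is ``%rbp``.

```python
REGISTER, DIRECT, INDIRECT, CONSTANT, CONSTANT_INDEX = 0x1, 0x2, 0x3, 0x4, 0x5


def resolve_location(kind, reg, offset, registers, memory, constants):
    """Return the value described by one stack map location entry."""
    if kind == REGISTER:
        return registers[reg]                   # value in a register
    if kind == DIRECT:
        return registers[reg] + offset          # the address itself
    if kind == INDIRECT:
        return memory[registers[reg] + offset]  # load the spilled value
    if kind == CONSTANT:
        return offset                           # small constant
    if kind == CONSTANT_INDEX:
        return constants[offset]                # large constant from the pool
    raise ValueError(f"unknown location type {kind:#x}")


# Hypothetical machine state at the stack map's instruction address.
registers = {0: 42, 6: 0x7000}
memory = {0x7000 - 8: 99}
constants = [1 << 40]

spilled = resolve_location(INDIRECT, 6, -8, registers, memory, constants)
```
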
At each callsite, a "liveout" register list is also recorded. These
are the registers that are live across the stackmap and therefore must
be saved by the runtime. This is an important optimization when the
patchpoint intrinsic is used with a calling convention that by default
preserves most registers as callee-save.

Each entry in the liveout register list contains a DWARF register
number and size in bytes. The stackmap format deliberately omits
specific subregister information. Instead the runtime must interpret
this information conservatively. For example, if the stackmap reports
one byte at ``%rax``, then the value may be in either ``%al`` or
``%ah``. It doesn't matter in practice, because the runtime will
simply save ``%rax``. However, if the stackmap reports 16 bytes at
``%ymm0``, then the runtime can safely optimize by saving only
``%xmm0``.

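The conservative reading of a liveout entry can be sketched as picking the
narrowest save the runtime supports that still covers the reported size.
The table below is a hypothetical, deliberately tiny example. It assumes
the x86-64 DWARF numbering (0 for ``%rax``, 17 for ``%xmm0``/``%ymm0``) and
a runtime that saves general-purpose registers as 8 bytes and vector
registers as 16 or 32 bytes.

```python
# DWARF register number -> save widths (in bytes) the runtime supports.
# Hypothetical table covering only the two registers discussed above.
SAVE_WIDTHS = {
    0: (8,),       # %rax: always saved whole
    17: (16, 32),  # %xmm0 (16 bytes) or %ymm0 (32 bytes)
}


def save_width(dwarf_reg, size_in_bytes):
    """Return the narrowest supported save covering `size_in_bytes`."""
    for width in sorted(SAVE_WIDTHS[dwarf_reg]):
        if width >= size_in_bytes:
            return width
    raise ValueError("liveout is wider than any supported save")


one_byte_in_rax = save_width(0, 1)         # saves all of %rax
sixteen_bytes_in_ymm0 = save_width(17, 16)  # saving %xmm0 is enough
```
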
The stack map format is a contract between an LLVM SVN revision and
the runtime. It is currently experimental and may change in the short
term, but minimizing the need to update the runtime is
important. Consequently, the stack map design is motivated by
simplicity and extensibility. Compactness of the representation is
secondary because the runtime is expected to parse the data
immediately after compiling a module and encode the information in its
own format. Since the runtime controls the allocation of sections, it
can reuse the same stack map space for multiple modules.

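As a rough illustration of the parsing step, the following Python sketch
reads the version 2 layout listed above with the standard ``struct``
module. It assumes a little-endian target and a section that starts on an
8-byte boundary. ``parse_stack_map`` and ``Record`` are hypothetical names,
and the byte string at the end is a hand-built example, not real compiler
output.

```python
import struct
from dataclasses import dataclass

LOCATION_TYPES = {1: "Register", 2: "Direct", 3: "Indirect",
                  4: "Constant", 5: "ConstantIndex"}


@dataclass
class Record:
    patchpoint_id: int
    instruction_offset: int
    locations: list  # (type name, DWARF register number, offset)
    live_outs: list  # (DWARF register number, size in bytes)


def parse_stack_map(data: bytes):
    """Parse a version 2, little-endian stack map section."""
    version, _, _ = struct.unpack_from("<BBH", data, 0)
    if version != 2:
        raise ValueError(f"unsupported stack map version {version}")
    num_functions, num_constants, num_records = struct.unpack_from(
        "<III", data, 4)
    pos = 16

    functions = []  # (function address, stack size, record count)
    for _ in range(num_functions):
        functions.append(struct.unpack_from("<QQQ", data, pos))
        pos += 24

    constants = []
    for _ in range(num_constants):
        constants.append(struct.unpack_from("<Q", data, pos)[0])
        pos += 8

    records = []
    for _ in range(num_records):
        patchpoint_id, offset, _flags, num_locations = struct.unpack_from(
            "<QIHH", data, pos)
        pos += 16
        locations = []
        for _ in range(num_locations):
            kind, _flags, reg, value = struct.unpack_from("<BBHi", data, pos)
            pos += 8
            locations.append((LOCATION_TYPES[kind], reg, value))
        _padding, num_live_outs = struct.unpack_from("<HH", data, pos)
        pos += 4
        live_outs = []
        for _ in range(num_live_outs):
            reg, _reserved, size = struct.unpack_from("<HBB", data, pos)
            pos += 4
            live_outs.append((reg, size))
        pos = (pos + 7) & ~7  # skip padding up to the next 8-byte boundary
        records.append(Record(patchpoint_id, offset, locations, live_outs))
    return functions, constants, records


# Hand-built section: one function, one large constant, two records.
blob = (
    struct.pack("<BBH", 2, 0, 0) + struct.pack("<III", 1, 1, 2)
    + struct.pack("<QQQ", 0x1000, 24, 2)
    + struct.pack("<Q", 1 << 40)
    # Record 1: one Register location, no liveouts, 4 bytes of padding.
    + struct.pack("<QIHH", 77, 0x05, 0, 1)
    + struct.pack("<BBHi", 1, 0, 0, 0)
    + struct.pack("<HH", 0, 0) + b"\x00" * 4
    # Record 2: one Indirect location and one 8-byte liveout.
    + struct.pack("<QIHH", 78, 0x0F, 0, 1)
    + struct.pack("<BBHi", 3, 0, 6, -8)
    + struct.pack("<HH", 0, 1) + struct.pack("<HBB", 0, 0, 8)
)
functions, constants, records = parse_stack_map(blob)
```

A production parser would also validate lengths and reserved fields; the
sketch omits that to keep the layout visible.
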
Stackmap support is currently only implemented for 64-bit
platforms. However, a 32-bit implementation should be able to use the
same format with an insignificant amount of wasted space.

.. _stackmap-section:

Stack Map Section
^^^^^^^^^^^^^^^^^

A JIT compiler can easily access this section by providing its own
memory manager via the LLVM C API
``LLVMCreateSimpleMCJITMemoryManager()``. When creating the memory
manager, the JIT provides a callback:
``LLVMMemoryManagerAllocateDataSectionCallback()``. When LLVM creates
this section, it invokes the callback and passes the section name. The
JIT can record the in-memory address of the section at this time and
later parse it to recover the stack map data.

On Darwin, the stack map section name is "__llvm_stackmaps". The
segment name is "__LLVM_STACKMAPS".

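The bookkeeping done inside the data-section callback can be sketched in
Python. This is a stand-in for the logic only. It does not call the LLVM C
API, the class and method names are hypothetical, and the parameter list
is modeled loosely on the C callback (size, alignment, section ID, section
name, read-only flag). Only the Darwin section name given above is
recognized.

```python
STACKMAP_SECTION_NAME = "__llvm_stackmaps"  # Darwin name from the text above


class StackMapSectionRecorder:
    """Remembers which allocated data section holds the stack maps."""

    def __init__(self):
        self.sections = {}    # section ID -> (name, buffer)
        self.stackmap = None  # buffer of the stack map section, if seen

    def allocate_data_section(self, size, alignment, section_id,
                              section_name, is_read_only):
        # A real memory manager would honor `alignment` and page
        # permissions; this stand-in just hands out a zeroed buffer.
        buffer = bytearray(size)
        self.sections[section_id] = (section_name, buffer)
        if section_name == STACKMAP_SECTION_NAME:
            self.stackmap = buffer
        return buffer


recorder = StackMapSectionRecorder()
recorder.allocate_data_section(64, 8, 1, "__data", False)
stackmap_buffer = recorder.allocate_data_section(
    112, 8, 2, "__llvm_stackmaps", True)
```

Once code generation has filled the buffer, ``recorder.stackmap`` is what
the runtime would hand to its stack map parser.
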
Stack Map Usage
===============

The stack map support described in this document can be used to
precisely determine the location of values at a specific position in
the code. LLVM does not maintain any mapping between those values and
any higher-level entity. The runtime must be able to interpret the
stack map record given only the ID, offset, and the order of the
locations, records, and functions, which LLVM preserves.

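One way a runtime might use the preserved ordering is to attribute records
to functions and turn function-relative offsets into absolute addresses.
The Python sketch below assumes that records appear grouped by function, in
the same order as the ``StkSizeRecord`` entries, with ``Record Count``
records per function. ``assign_records_to_functions`` is a hypothetical
helper working on plain tuples.

```python
def assign_records_to_functions(functions, records):
    """Return (patchpoint_id, absolute address) pairs.

    functions: (function address, stack size, record count) tuples
    records:   (patchpoint_id, instruction offset) tuples, in section order
    """
    resolved = []
    index = 0
    for address, _stack_size, record_count in functions:
        for patchpoint_id, offset in records[index:index + record_count]:
            resolved.append((patchpoint_id, address + offset))
        index += record_count
    if index != len(records):
        raise ValueError("record counts do not match the number of records")
    return resolved


functions = [(0x1000, 24, 2), (0x2000, 8, 1)]
records = [(77, 0x05), (78, 0x20), (77, 0x03)]
resolved = assign_records_to_functions(functions, records)
```
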
Note that this is quite different from the goal of debug information,
which is a best-effort attempt to track the location of named
variables at every instruction.

An important motivation for this design is to allow a runtime to
commandeer a stack frame when execution reaches an instruction address
associated with a stack map. The runtime must be able to rebuild a
stack frame and resume program execution using the information
provided by the stack map. For example, execution may resume in an
interpreter or a recompiled version of the same function.

This usage restricts LLVM optimization. Clearly, LLVM must not move
stores across a stack map. However, loads must also be handled
conservatively. If the load may trigger an exception, hoisting it
above a stack map could be invalid. For example, the runtime may
determine that a load is safe to execute without a type check given
the current state of the type system. If the type system changes while
some activation of the load's function exists on the stack, the load
becomes unsafe. The runtime can prevent subsequent execution of that
load by immediately patching any stack map location that lies between
the current call site and the load (typically, the runtime would
simply patch all stack map locations to invalidate the function). If
the compiler had hoisted the load above the stack map, then the
program could crash before the runtime could take back control.

To enforce these semantics, stackmap and patchpoint intrinsics are
considered to potentially read and write all memory. This may limit
optimization more than some clients desire. This limitation may be
avoided by marking the call site as ``readonly``. In the future we may
also allow metadata to be added to the intrinsic call to express
aliasing, thereby allowing optimizations to hoist certain loads above
stack maps.

Direct Stack Map Entries
^^^^^^^^^^^^^^^^^^^^^^^^

As shown in :ref:`stackmap-format`, a Direct stack map location
records the address of a frame index. This address is itself the value
that the runtime requested. This differs from Indirect locations,
which refer to stack locations from which the requested values must
be loaded. Direct locations can communicate the address of an alloca,
while Indirect locations handle register spills.

For example:

.. code-block:: none

  entry:
    %a = alloca i64...
    llvm.experimental.stackmap(i64 <ID>, i32 <shadowBytes>, i64* %a)

The runtime can determine this alloca's relative location on the
stack immediately after compilation, or at any time thereafter. This
differs from Register and Indirect locations, because the runtime can
only read the values in those locations when execution reaches the
instruction address of the stack map.

This functionality requires LLVM to treat entry-block allocas
specially when they are directly consumed by one of these intrinsics.
(This is the same requirement imposed by the ``llvm.gcroot``
intrinsic.) LLVM transformations must not substitute the alloca with
any intervening value. This can be verified by the runtime simply by
checking that the stack map's location is a Direct location type.


Supported Architectures
=======================

Support for StackMap generation and the related intrinsics requires
some code for each backend. Today, only a subset of LLVM's backends
are supported. The currently supported architectures are X86_64,
PowerPC, and AArch64.