.. _segmented_stacks:

========================
Segmented Stacks in LLVM
========================

.. contents::
   :local:

Introduction
============

Segmented stacks allow stack space to be allocated incrementally rather than as
a monolithic chunk (of some worst-case size) at thread initialization. This is
done by allocating stack blocks (henceforth called *stacklets*) and linking them
into a doubly linked list. The function prologue is responsible for checking
whether the current stacklet has enough space for the function to execute and,
if not, calling into the libgcc runtime to allocate more stack space. When using
``llc``, segmented stacks can be enabled by adding ``-segmented-stacks`` to the
command line.

The runtime functionality is `already there in libgcc
<http://gcc.gnu.org/wiki/SplitStacks>`_.
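
As a rough illustration of the doubly linked list of stacklets described above,
the following C sketch models a stacklet header and the operations that link and
unlink it. The ``stacklet`` type and the ``stacklet_push`` and ``stacklet_pop``
helpers are hypothetical names chosen for this example; they are not the data
structures that libgcc actually uses.

.. code-block:: c

    #include <assert.h>
    #include <stddef.h>
    #include <stdlib.h>

    /* Hypothetical stacklet header; the real libgcc layout differs. */
    typedef struct stacklet {
        struct stacklet *prev;  /* stacklet that was active before this one */
        struct stacklet *next;  /* stacklet allocated after this one, if any */
        size_t size;            /* usable bytes in this stacklet */
        unsigned char *base;    /* start of the usable memory */
    } stacklet;

    /* Allocate a new stacklet and link it after `current` (may be NULL). */
    stacklet *stacklet_push(stacklet *current, size_t size)
    {
        stacklet *s = malloc(sizeof *s + size);
        if (s == NULL)
            return NULL;
        s->prev = current;
        s->next = NULL;
        s->size = size;
        s->base = (unsigned char *)(s + 1);
        if (current != NULL)
            current->next = s;
        return s;
    }

    /* Unlink and free the newest stacklet, returning the previous one. */
    stacklet *stacklet_pop(stacklet *current)
    {
        assert(current != NULL && current->next == NULL);
        stacklet *prev = current->prev;
        if (prev != NULL)
            prev->next = NULL;
        free(current);
        return prev;
    }

In this model, a thread starts with one stacklet, pushes another when a function
needs more room, and pops it again once that function returns.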

Implementation Details
======================

.. _allocating stacklets:

Allocating Stacklets
--------------------

As mentioned above, the function prologue checks whether the current stacklet
has enough space. The current approach is to use a slot in the TCB (thread
control block) to store the current stack limit (minus the amount of space
needed to allocate a new block). The offset of this slot is dictated by
``libgcc``. The generated assembly looks like this on x86-64:

.. code-block:: nasm

    leaq     -8(%rsp), %r10
    cmpq     %fs:112,  %r10
    jg       .LBB0_2

    # More stack space needs to be allocated
    movabsq  $8,    %r10   # The amount of space needed
    movabsq  $0,    %r11   # The total size of arguments passed on stack
    callq    __morestack
    ret                    # The reason for this extra return is explained below
  .LBB0_2:
    # Usual prologue continues here

The size of the function arguments on the stack needs to be passed to
``__morestack`` (which is implemented in ``libgcc``), since that number of bytes
has to be copied from the previous stacklet to the current one. This ensures
that SP-relative (and FP-relative) addressing of function arguments works as
expected.
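
The comparison performed by the prologue can be modelled in C as follows. This
is only a sketch: ``thread_control_block`` and ``frame_fits`` are hypothetical
names, and the model uses unsigned address arithmetic rather than reproducing
the exact instructions.

.. code-block:: c

    #include <assert.h>
    #include <stdbool.h>
    #include <stdint.h>

    /* Hypothetical model of the per-thread slot holding the stack limit. */
    typedef struct {
        uintptr_t stack_limit;  /* lowest address the stacklet may use */
    } thread_control_block;

    /* The stack grows downwards, so the frame fits only if the address just
     * below the new frame is still above the stored limit. When this returns
     * false, the prologue would call __morestack instead. */
    bool frame_fits(const thread_control_block *tcb, uintptr_t sp,
                    uintptr_t frame_size)
    {
        assert(sp >= frame_size);  /* guard against wrap-around in the model */
        return sp - frame_size > tcb->stack_limit;
    }

With a limit of ``0x1000``, a frame of 8 bytes fits when SP is ``0x1010`` but
not when SP is ``0x1008``, because the comparison is strict.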

The unusual ``ret`` is needed so that the function that called ``__morestack``
returns correctly. Instead of returning, ``__morestack`` calls into ``.LBB0_2``.
This is possible because both the size of the ``ret`` instruction and the PC of
the call to ``__morestack`` are known. When the function body returns, control
is transferred back to ``__morestack``, which then deallocates the new stacklet,
restores the correct SP value, and does a second return, which returns control
to the correct caller.

Variable Sized Allocas
----------------------

The section on `allocating stacklets`_ implicitly assumes that every stack
frame has a fixed size. However, LLVM allows the ``alloca`` instruction to
allocate dynamically sized blocks of memory on the stack. When faced with such a
variable-sized alloca, code is generated to:

* Check if the current stacklet has enough space. If yes, just bump the SP, like
  in the normal case.
* If not, generate a call to ``libgcc``, which allocates the memory from the
  heap.

The memory allocated from the heap is linked into a list in the current
stacklet and is freed along with it. This prevents a memory leak.
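
The two cases for a variable-sized alloca can be sketched in C as shown here.
Every name in the example (``stacklet_state``, ``heap_block``,
``dynamic_alloca``, ``release_heap_blocks``) is hypothetical, and the code only
models the decision and the bookkeeping; it is not the code that LLVM emits or
that ``libgcc`` runs.

.. code-block:: c

    #include <assert.h>
    #include <stdint.h>
    #include <stdlib.h>

    /* Hypothetical record for a heap block owned by a stacklet. */
    typedef struct heap_block {
        struct heap_block *next;
    } heap_block;

    /* Hypothetical model of the current stacklet. */
    typedef struct {
        uintptr_t sp;           /* current stack pointer (stack grows down) */
        uintptr_t limit;        /* lowest usable address in the stacklet */
        heap_block *heap_list;  /* heap blocks to free with the stacklet */
    } stacklet_state;

    /* Takes `size` bytes from the stacklet when they fit, otherwise from the
     * heap. Returns NULL if the heap allocation fails. */
    void *dynamic_alloca(stacklet_state *st, size_t size)
    {
        assert(st != NULL);
        if (st->sp > st->limit && st->sp - st->limit > size) {
            st->sp -= size;            /* enough space: just bump the SP */
            return (void *)st->sp;
        }
        heap_block *blk = malloc(sizeof *blk + size);
        if (blk == NULL)
            return NULL;
        blk->next = st->heap_list;     /* link the block into the stacklet */
        st->heap_list = blk;
        return blk + 1;                /* payload follows the list header */
    }

    /* Frees every heap block recorded in the stacklet, as would happen when
     * the stacklet itself is released. */
    void release_heap_blocks(stacklet_state *st)
    {
        while (st->heap_list != NULL) {
            heap_block *next = st->heap_list->next;
            free(st->heap_list);
            st->heap_list = next;
        }
    }

Keeping the heap blocks on a per-stacklet list means that a single walk of the
list releases all of them, which is how the leak is avoided in this model.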