===============
ShadowCallStack
===============

.. contents::
   :local:

Introduction
============

ShadowCallStack is an **experimental** instrumentation pass, currently
implemented only for x86_64 and aarch64, that protects programs against return
address overwrites (e.g. stack buffer overflows). It works by saving a
function's return address to a separately allocated 'shadow call stack'
in the function prolog and checking the return address on the stack against
the shadow call stack in the function epilog.

Comparison
----------

To optimize for memory consumption and cache locality, the shadow call stack
stores an index followed by an array of return addresses. This is in contrast
to other schemes, like :doc:`SafeStack`, that mirror the entire stack and
trade off higher memory consumption for shorter function prologs and epilogs
with fewer memory accesses. Similarly, `Return Flow Guard`_ uses more memory
than ShadowCallStack in exchange for shorter function prologs and epilogs, but
suffers from the same race conditions (see `Security`_). Intel
`Control-flow Enforcement Technology`_ (CET) is a proposed hardware extension
that would add native support for using a shadow stack to store and check
return addresses at call and return time. It would not suffer from race
conditions at calls and returns and would not incur the overhead of function
instrumentation, but it does require operating system support.

.. _`Return Flow Guard`: https://xlab.tencent.com/en/2016/11/02/return-flow-guard/
.. _`Control-flow Enforcement Technology`: https://software.intel.com/sites/default/files/managed/4d/2a/control-flow-enforcement-technology-preview.pdf

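The "index followed by an array of return addresses" layout can be pictured
with a small C model. This is only an illustrative sketch, not the
instrumentation itself: the names ``scs_model``, ``scs_push`` and
``scs_pop_and_check``, the fixed capacity, and the bounds checks are all
assumptions made for the example. Slot 0 holds a byte offset to the most
recent entry, and the return addresses follow it.

.. code-block:: c

  #include <stdbool.h>
  #include <stddef.h>
  #include <stdint.h>

  enum { SCS_SLOTS = 64 }; /* hypothetical capacity, chosen for the sketch */

  /* slot[0] is the index (a byte offset to the most recent entry);
     slot[1..] hold the saved return addresses. */
  struct scs_model {
    uintptr_t slot[SCS_SLOTS];
  };

  static void scs_init(struct scs_model *s) { s->slot[0] = 0; }

  /* Models the prolog: advance the index, then store the return address
     at the new top. The bounds check is an addition of this model. */
  static bool scs_push(struct scs_model *s, uintptr_t ret) {
    size_t next = s->slot[0] / sizeof(uintptr_t) + 1;
    if (next >= SCS_SLOTS)
      return false;
    s->slot[0] = next * sizeof(uintptr_t);
    s->slot[next] = ret;
    return true;
  }

  /* Models the epilog: read the saved address, move the index back, and
     compare against the address found on the regular stack. */
  static bool scs_pop_and_check(struct scs_model *s, uintptr_t ret_on_stack) {
    size_t top = s->slot[0] / sizeof(uintptr_t);
    if (top == 0)
      return false; /* empty shadow stack: treated as a mismatch here */
    uintptr_t saved = s->slot[top];
    s->slot[0] -= sizeof(uintptr_t);
    return saved == ret_on_stack;
  }

In this model a mismatch is reported through the return value; the real
instrumentation has no such interface and traps instead.
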
Compatibility
-------------

ShadowCallStack currently only supports x86_64 and aarch64. A runtime is not
currently provided in compiler-rt, so one must be provided by the compiled
application.

On aarch64, the instrumentation makes use of the platform register ``x18``.
On some platforms, ``x18`` is reserved, and on others, it is designated as
a scratch register. This generally means that any code that may run on the
same thread as code compiled with ShadowCallStack must either target one
of the platforms whose ABI reserves ``x18`` (currently Darwin, Fuchsia and
Windows) or be compiled with the flag ``-ffixed-x18``.

Security
========

ShadowCallStack is intended to be a stronger alternative to
``-fstack-protector``. It protects from non-linear overflows and arbitrary
memory writes to the return address slot; however, like
``-fstack-protector``, this protection suffers from race conditions because of
the call-return semantics on x86_64. There is a short race between the call
instruction and the first instruction in the function that reads the return
address, during which an attacker could overwrite the return address and bypass
ShadowCallStack. Similarly, there is a time-of-check-to-time-of-use race in the
function epilog, where an attacker could overwrite the return address after it
has been checked and before it has been returned to. Modifying the call-return
semantics to fix this on x86_64 would incur an unacceptable performance overhead
due to return branch prediction.

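The time-of-check-to-time-of-use race in the epilog can be illustrated with a
deliberately simplified, single-threaded C model. Everything here is
hypothetical (``frame_model``, ``run_epilog``, and the use of ``0`` to stand
for a trap); it only shows why the ordering of the check and the overwrite
matters, not how a real attack would be carried out.

.. code-block:: c

  #include <stdbool.h>
  #include <stdint.h>

  /* Hypothetical model of one frame: the return address slot on the
     regular stack and the copy saved by the prolog. */
  struct frame_model {
    uintptr_t stack_slot;
    uintptr_t shadow_copy;
  };

  /* The epilog check: the stack slot must match the shadow copy. */
  static bool epilog_check(const struct frame_model *f) {
    return f->stack_slot == f->shadow_copy;
  }

  /* Returns the address the modeled return would jump to, or 0 if the
     check trapped. 'overwrite_after_check' selects whether the modeled
     attacker write lands before or after the check. */
  static uintptr_t run_epilog(struct frame_model *f, uintptr_t evil,
                              bool overwrite_after_check) {
    if (!overwrite_after_check)
      f->stack_slot = evil; /* write lands before the check */
    if (!epilog_check(f))
      return 0;             /* mismatch detected: trap */
    if (overwrite_after_check)
      f->stack_slot = evil; /* write lands inside the race window */
    return f->stack_slot;   /* the return still reads the stack slot */
  }

An overwrite that lands before the check is caught, while one that lands
between the check and the return is not, which is the window described above.
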
The instrumentation makes use of the ``gs`` segment register on x86_64,
or the ``x18`` register on aarch64, to reference the shadow call stack,
meaning that references to the shadow call stack do not have to be stored in
memory. This makes it possible to implement a runtime that avoids exposing
the address of the shadow call stack to attackers that can read arbitrary
memory. However, attackers could still try to exploit side channels exposed
by the operating system `[1]`_ `[2]`_ or processor `[3]`_ to discover the
address of the shadow call stack.

.. _`[1]`: https://eyalitkin.wordpress.com/2017/09/01/cartography-lighting-up-the-shadows/
.. _`[2]`: https://www.blackhat.com/docs/eu-16/materials/eu-16-Goktas-Bypassing-Clangs-SafeStack.pdf
.. _`[3]`: https://www.vusec.net/projects/anc/

On x86_64, leaf functions are optimized to store the return address in a
free register and avoid writing to the shadow call stack if a register is
available. Very short leaf functions are left uninstrumented if their
execution is judged to be shorter than the race condition window intrinsic to
the instrumentation.

On aarch64, the architecture's call and return instructions (``bl`` and
``ret``) operate on a register rather than the stack, which means that
leaf functions are generally protected from return address overwrites even
without ShadowCallStack. It also means that ShadowCallStack on aarch64 is not
vulnerable to the same types of time-of-check-to-time-of-use races as x86_64.

Usage
=====

To enable ShadowCallStack, pass the ``-fsanitize=shadow-call-stack`` flag to
both the compile and link command lines. On aarch64, you also need to pass
``-ffixed-x18`` unless your target already reserves ``x18``.

Low-level API
-------------

``__has_feature(shadow_call_stack)``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In some cases one may need to execute different code depending on whether
ShadowCallStack is enabled. The macro ``__has_feature(shadow_call_stack)`` can
be used for this purpose.

.. code-block:: c

    #if defined(__has_feature)
    #  if __has_feature(shadow_call_stack)
    // code that builds only under ShadowCallStack
    #  endif
    #endif

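One possible way to use this check from ordinary code is to fold it into a
single constant. The sketch below is illustrative only: ``SCS_ENABLED`` and
``scs_status`` are hypothetical names, and the fallback definition of
``__has_feature`` is there so that compilers without that macro still build
the file.

.. code-block:: c

  /* Fallback for compilers that do not provide __has_feature. */
  #ifndef __has_feature
  #  define __has_feature(x) 0
  #endif

  /* SCS_ENABLED is a hypothetical project-local macro. */
  #if __has_feature(shadow_call_stack)
  #  define SCS_ENABLED 1
  #else
  #  define SCS_ENABLED 0
  #endif

  /* Reports whether this translation unit was built with ShadowCallStack. */
  static const char *scs_status(void) {
    return SCS_ENABLED ? "enabled" : "disabled";
  }

A caller can then branch on ``SCS_ENABLED`` in regular ``if`` statements
instead of repeating the nested preprocessor conditionals.
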
``__attribute__((no_sanitize("shadow-call-stack")))``
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Use ``__attribute__((no_sanitize("shadow-call-stack")))`` on a function
declaration to specify that the shadow call stack instrumentation should not be
applied to that function, even if enabled globally.

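As a rough illustration, the attribute can be wrapped in a macro so that the
same source still builds with other compilers. ``NO_SHADOW_CALL_STACK`` and
``add_one`` are hypothetical names used only for this sketch, and restricting
the attribute to Clang is a conservative assumption of the example.

.. code-block:: c

  /* Hypothetical convenience macro; expands to nothing outside Clang. */
  #if defined(__clang__)
  #  define NO_SHADOW_CALL_STACK \
       __attribute__((no_sanitize("shadow-call-stack")))
  #else
  #  define NO_SHADOW_CALL_STACK
  #endif

  /* This function is left uninstrumented even when the rest of the
     program is built with -fsanitize=shadow-call-stack. */
  NO_SHADOW_CALL_STACK
  static int add_one(int x) {
    return x + 1;
  }
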
Example
=======

The following example code:

.. code-block:: c++

    int foo() {
      return bar() + 1;
    }

generates the following x86_64 assembly when compiled with ``-O2``:

.. code-block:: gas

    push   %rax
    callq  bar
    add    $0x1,%eax
    pop    %rcx
    retq

or the following aarch64 assembly:

.. code-block:: none

    stp     x29, x30, [sp, #-16]!
    mov     x29, sp
    bl      bar
    add     w0, w0, #1
    ldp     x29, x30, [sp], #16
    ret


Adding ``-fsanitize=shadow-call-stack`` produces the following x86_64
assembly:

.. code-block:: gas

    mov    (%rsp),%r10
    xor    %r11,%r11
    addq   $0x8,%gs:(%r11)
    mov    %gs:(%r11),%r11
    mov    %r10,%gs:(%r11)
    push   %rax
    callq  bar
    add    $0x1,%eax
    pop    %rcx
    xor    %r11,%r11
    mov    %gs:(%r11),%r10
    mov    %gs:(%r10),%r10
    subq   $0x8,%gs:(%r11)
    cmp    %r10,(%rsp)
    jne    trap
    retq

    trap:
    ud2

or the following aarch64 assembly:

.. code-block:: none

    str     x30, [x18], #8
    stp     x29, x30, [sp, #-16]!
    mov     x29, sp
    bl      bar
    add     w0, w0, #1
    ldp     x29, x30, [sp], #16
    ldr     x30, [x18, #-8]!
    ret