==========================
UndefinedBehaviorSanitizer
==========================

.. contents::
   :local:

Introduction
============

UndefinedBehaviorSanitizer (UBSan) is a fast undefined behavior detector.
UBSan modifies the program at compile-time to catch various kinds of undefined
behavior during program execution, for example:

* Using a misaligned or null pointer
* Signed integer overflow
* Conversion to, from, or between floating-point types which would
  overflow the destination

See the full list of available :ref:`checks <ubsan-checks>` below.

UBSan has an optional run-time library which provides better error reporting.
The checks have a small runtime cost and no impact on address space layout or ABI.

How to build
============

Build LLVM/Clang with `CMake <http://llvm.org/docs/CMake.html>`_.

Usage
=====

Use ``clang++`` to compile and link your program with the
``-fsanitize=undefined`` flag. Make sure to use ``clang++`` (not ``ld``) as the
linker, so that your executable is linked with the proper UBSan runtime
libraries. You can use ``clang`` instead of ``clang++`` if you're
compiling/linking C code.

.. code-block:: console

  % cat test.cc
  int main(int argc, char **argv) {
    int k = 0x7fffffff;
    k += argc;
    return 0;
  }
  % clang++ -fsanitize=undefined test.cc
  % ./a.out
  test.cc:3:5: runtime error: signed integer overflow: 2147483647 + 1 cannot be represented in type 'int'

You can enable only a subset of :ref:`checks <ubsan-checks>` offered by UBSan,
and define the desired behavior for each kind of check:

* ``-fsanitize=...``: print a verbose error report and continue execution (default);
* ``-fno-sanitize-recover=...``: print a verbose error report and exit the program;
* ``-fsanitize-trap=...``: execute a trap instruction (doesn't require UBSan run-time support).

For example, if you compile/link your program as:

.. code-block:: console

  % clang++ -fsanitize=signed-integer-overflow,null,alignment -fno-sanitize-recover=null -fsanitize-trap=alignment

the program will continue execution after signed integer overflows, exit after
the first invalid use of a null pointer, and trap after the first use of a
misaligned pointer.
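
As a rough illustration of the three modes, the sketch below shows the kind of
code that each check in the command above would instrument. The function names
are hypothetical and chosen only for this example, and the comments restate the
behavior described above rather than showing actual tool output.

.. code-block:: c++

  // Hypothetical helpers; the names are illustrative only.

  // signed-integer-overflow (default mode): if k + by overflows, UBSan
  // prints a report and execution continues.
  int increment(int k, int by) { return k + by; }

  // null (-fno-sanitize-recover=null): if p is null, a report is printed
  // and the program exits.
  // alignment (-fsanitize-trap=alignment): if p is misaligned for int,
  // a trap instruction is executed instead of a report being printed.
  int load(const int *p) { return *p; }

Calling these helpers with in-range values and a valid pointer triggers none of
the checks.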

.. _ubsan-checks:

Available checks
================

Available checks are:

- ``-fsanitize=alignment``: Use of a misaligned pointer or creation
  of a misaligned reference.
- ``-fsanitize=bool``: Load of a ``bool`` value which is neither
  ``true`` nor ``false``.
- ``-fsanitize=bounds``: Out of bounds array indexing, in cases
  where the array bound can be statically determined.
- ``-fsanitize=enum``: Load of a value of an enumerated type which
  is not in the range of representable values for that enumerated
  type.
- ``-fsanitize=float-cast-overflow``: Conversion to, from, or
  between floating-point types which would overflow the
  destination.
- ``-fsanitize=float-divide-by-zero``: Floating point division by
  zero.
- ``-fsanitize=function``: Indirect call of a function through a
  function pointer of the wrong type (Linux, C++ and x86/x86_64 only).
- ``-fsanitize=integer-divide-by-zero``: Integer division by zero.
- ``-fsanitize=nonnull-attribute``: Passing a null pointer as a function
  parameter which is declared to never be null.
- ``-fsanitize=null``: Use of a null pointer or creation of a null
  reference.
- ``-fsanitize=nullability-arg``: Passing null as a function parameter
  which is annotated with ``_Nonnull``.
- ``-fsanitize=nullability-assign``: Assigning null to an lvalue which
  is annotated with ``_Nonnull``.
- ``-fsanitize=nullability-return``: Returning null from a function with
  a return type annotated with ``_Nonnull``.
- ``-fsanitize=object-size``: An attempt to potentially use bytes which
  the optimizer can determine are not part of the object being accessed.
  This will also detect some types of undefined behavior that may not
  directly access memory, but are provably incorrect given the size of
  the objects involved, such as invalid downcasts and calling methods on
  invalid pointers. These checks are made in terms of
  ``__builtin_object_size``, and consequently may be able to detect more
  problems at higher optimization levels.
- ``-fsanitize=pointer-overflow``: Performing pointer arithmetic which
  overflows.
- ``-fsanitize=return``: In C++, reaching the end of a
  value-returning function without returning a value.
- ``-fsanitize=returns-nonnull-attribute``: Returning a null pointer
  from a function which is declared to never return null.
- ``-fsanitize=shift``: Shift operators where the amount shifted is
  greater than or equal to the promoted bit-width of the left hand side
  or less than zero, or where the left hand side is negative. For a
  signed left shift, also checks for signed overflow in C, and for
  unsigned overflow in C++. You can use ``-fsanitize=shift-base`` or
  ``-fsanitize=shift-exponent`` to check only the left-hand side or
  the right-hand side of the shift operation, respectively.
- ``-fsanitize=signed-integer-overflow``: Signed integer overflow,
  including all the checks added by ``-ftrapv``, and checking for
  overflow in signed division (``INT_MIN / -1``).
- ``-fsanitize=unreachable``: If control flow reaches
  ``__builtin_unreachable``.
- ``-fsanitize=unsigned-integer-overflow``: Unsigned integer
  overflows. Note that unlike signed integer overflow, unsigned integer
  overflow is not undefined behavior. However, while it has well-defined
  semantics, it is often unintentional, so UBSan offers to catch it.
- ``-fsanitize=vla-bound``: A variable-length array whose bound
  does not evaluate to a positive value.
- ``-fsanitize=vptr``: Use of an object whose vptr indicates that
  it is of the wrong dynamic type, or that its lifetime has not
  begun or has ended. Incompatible with ``-fno-rtti``. Link must
  be performed by ``clang++``, not ``clang``, to make sure C++-specific
  parts of the runtime library and C++ standard libraries are present.
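
To make a few of these checks concrete, here is a minimal sketch of functions
that the named checks would instrument. The function names are hypothetical,
the conditions in the comments paraphrase the list above, and the shift comment
assumes a platform where ``int`` is 32 bits wide.

.. code-block:: c++

  #include <cstdint>

  // -fsanitize=shift: reported when amount is negative or >= 32, the
  // bit-width of the promoted left-hand side (assuming 32-bit int).
  std::uint32_t shift_left(std::uint32_t value, int amount) {
    return value << amount;
  }

  // -fsanitize=integer-divide-by-zero: reported when b == 0.
  // -fsanitize=signed-integer-overflow: reported for INT_MIN / -1.
  int divide(int a, int b) { return a / b; }

  // -fsanitize=unsigned-integer-overflow: wraparound is well-defined, but
  // it is reported when this check is explicitly enabled.
  unsigned add_unsigned(unsigned a, unsigned b) { return a + b; }

  // -fsanitize=bounds: the bound (4) is statically known, so an index
  // outside 0..3 is reported.
  int element(int index) {
    static const int table[4] = {10, 20, 30, 40};
    return table[index];
  }

Only ``add_unsigned`` has defined behavior for every input; the other three
rely on their callers passing in-range arguments.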

You can also use the following check groups:

- ``-fsanitize=undefined``: All of the checks listed above other than
  ``unsigned-integer-overflow`` and the ``nullability-*`` checks.
- ``-fsanitize=undefined-trap``: Deprecated alias of
  ``-fsanitize=undefined``.
- ``-fsanitize=integer``: Checks for undefined or suspicious integer
  behavior (e.g. unsigned integer overflow).
- ``-fsanitize=nullability``: Enables ``nullability-arg``,
  ``nullability-assign``, and ``nullability-return``. While violating
  nullability does not have undefined behavior, it is often unintentional,
  so UBSan offers to catch it.

Stack traces and report symbolization
=====================================

If you want UBSan to print a symbolized stack trace for each error report, you
will need to:

#. Compile with ``-g`` and ``-fno-omit-frame-pointer`` to get proper debug
   information in your binary.
#. Run your program with the environment variable
   ``UBSAN_OPTIONS=print_stacktrace=1``.
#. Make sure the ``llvm-symbolizer`` binary is in ``PATH``.

Issue Suppression
=================

UndefinedBehaviorSanitizer is not expected to produce false positives.
If you see one, look again; most likely it is a true positive!

Disabling Instrumentation with ``__attribute__((no_sanitize("undefined")))``
----------------------------------------------------------------------------

You can disable UBSan checks for particular functions with
``__attribute__((no_sanitize("undefined")))``. You can use any value of the
``-fsanitize=`` flag in this attribute, e.g. if your function deliberately
contains possible signed integer overflow, you can use
``__attribute__((no_sanitize("signed-integer-overflow")))``.

This attribute may not be supported by other compilers, so consider using it
together with ``#if defined(__clang__)``.
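
A minimal sketch of that guard follows. The macro name
``NO_SANITIZE_UNSIGNED_OVERFLOW`` and the ``fnv1a`` function are hypothetical
and exist only for this example. It disables the ``unsigned-integer-overflow``
check for a hash function (32-bit FNV-1a) whose multiplication is meant to wrap
around, which is well-defined for unsigned types.

.. code-block:: c++

  #include <cstdint>
  #include <string>

  // Expand to the attribute only under Clang; other compilers see nothing.
  #if defined(__clang__)
  #define NO_SANITIZE_UNSIGNED_OVERFLOW \
    __attribute__((no_sanitize("unsigned-integer-overflow")))
  #else
  #define NO_SANITIZE_UNSIGNED_OVERFLOW
  #endif

  // 32-bit FNV-1a hash: the multiplication wraps modulo 2^32 by design.
  NO_SANITIZE_UNSIGNED_OVERFLOW
  std::uint32_t fnv1a(const std::string &text) {
    std::uint32_t hash = 2166136261u;  // FNV offset basis
    for (unsigned char byte : text) {
      hash ^= byte;
      hash *= 16777619u;  // FNV prime; intentional wraparound
    }
    return hash;
  }

Keeping the attribute behind a macro limits the opt-out to one function and
lets the same source build with compilers that do not recognize it.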

Suppressing Errors in Recompiled Code (Blacklist)
-------------------------------------------------

UndefinedBehaviorSanitizer supports ``src`` and ``fun`` entity types in
:doc:`SanitizerSpecialCaseList`, which can be used to suppress error reports
in the specified source files or functions.

Runtime suppressions
--------------------

Sometimes you can suppress UBSan error reports for specific files, functions,
or libraries without recompiling the code. You need to pass the path to a
suppression file in the ``UBSAN_OPTIONS`` environment variable.

.. code-block:: bash

    UBSAN_OPTIONS=suppressions=MyUBSan.supp

You need to specify a :ref:`check <ubsan-checks>` you are suppressing and the
bug location. For example:

.. code-block:: bash

  signed-integer-overflow:file-with-known-overflow.cpp
  alignment:function_doing_unaligned_access
  vptr:shared_object_with_vptr_failures.so

There are several limitations:

* Sometimes your binary must have enough debug info and/or symbol table, so
  that the runtime can figure out the source file or function name to match
  against the suppression.
* It is only possible to suppress recoverable checks. For the example above,
  you can additionally pass
  ``-fsanitize-recover=signed-integer-overflow,alignment,vptr``, although
  most UBSan checks are recoverable by default.
* Check groups (like ``undefined``) can't be used in a suppressions file; only
  fine-grained checks are supported.

Supported Platforms
===================

UndefinedBehaviorSanitizer is supported on the following operating systems:

* Android
* Linux
* FreeBSD
* OS X 10.6 onwards

and for the following architectures:

* i386/x86\_64
* ARM
* AArch64
* PowerPC64
* MIPS/MIPS64

Current Status
==============

UndefinedBehaviorSanitizer is available on selected platforms starting from LLVM
3.3. The test suite is integrated into the CMake build and can be run with
the ``check-ubsan`` command.

Additional Configuration
========================

UndefinedBehaviorSanitizer adds static check data for each check unless it is
in trap mode. This check data includes the full file name. The option
``-fsanitize-undefined-strip-path-components=N`` can be used to trim this
information. If ``N`` is positive, file information emitted by
UndefinedBehaviorSanitizer will drop the first ``N`` components from the file
path. If ``N`` is negative, only the last ``|N|`` components will be kept.

Example
-------

For a file called ``/code/library/file.cpp``, here is what would be emitted:

* Default (No flag, or ``-fsanitize-undefined-strip-path-components=0``): ``/code/library/file.cpp``
* ``-fsanitize-undefined-strip-path-components=1``: ``code/library/file.cpp``
* ``-fsanitize-undefined-strip-path-components=2``: ``library/file.cpp``
* ``-fsanitize-undefined-strip-path-components=-1``: ``file.cpp``
* ``-fsanitize-undefined-strip-path-components=-2``: ``library/file.cpp``
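
The rule can be modeled with a short sketch that reproduces the table above.
This is not Clang's implementation: ``split_components`` and
``strip_path_components`` are hypothetical names, only POSIX-style ``/``
separators are handled, and the leading ``/`` is counted as one component, as
the examples imply. What happens when ``|N|`` exceeds the number of components
is not specified here, so the sketch assumes clamping: a large positive ``N``
keeps the final component and a large negative ``N`` keeps the whole path.

.. code-block:: c++

  #include <cstddef>
  #include <string>
  #include <vector>

  // Split a path into components; a leading "/" counts as one component.
  std::vector<std::string> split_components(const std::string &path) {
    std::vector<std::string> parts;
    std::size_t i = 0;
    if (!path.empty() && path[0] == '/') {
      parts.push_back("/");
      i = 1;
    }
    while (i < path.size()) {
      std::size_t j = path.find('/', i);
      if (j == std::string::npos) j = path.size();
      if (j > i) parts.push_back(path.substr(i, j - i));
      i = j + 1;
    }
    return parts;
  }

  // Model of the documented rule: n > 0 drops the first n components,
  // n < 0 keeps the last |n| components, n == 0 leaves the path unchanged.
  std::string strip_path_components(const std::string &path, int n) {
    std::vector<std::string> parts = split_components(path);
    if (n == 0 || parts.empty()) return path;
    const std::size_t count = parts.size();
    std::size_t first;  // index of the first component to keep
    if (n > 0) {
      const std::size_t drop = static_cast<std::size_t>(n);
      first = drop < count ? drop : count - 1;  // assumed clamp
    } else {
      const std::size_t keep =
          static_cast<std::size_t>(-static_cast<long long>(n));
      first = keep < count ? count - keep : 0;  // assumed clamp
    }
    std::string out;
    for (std::size_t k = first; k < count; ++k) {
      if (parts[k] == "/") {
        out += "/";
        continue;
      }
      if (!out.empty() && out.back() != '/') out += "/";
      out += parts[k];
    }
    return out;
  }

For instance, ``strip_path_components("/code/library/file.cpp", -1)`` returns
``file.cpp`` under these assumptions.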

More Information
================

* From LLVM project blog:
  `What Every C Programmer Should Know About Undefined Behavior
  <http://blog.llvm.org/2011/05/what-every-c-programmer-should-know.html>`_
* From John Regehr's *Embedded in Academia* blog:
  `A Guide to Undefined Behavior in C and C++
  <http://blog.regehr.org/archives/213>`_