================
MemorySanitizer
================

.. contents::
   :local:

Introduction
============

MemorySanitizer is a detector of uninitialized reads. It consists of a
compiler instrumentation module and a run-time library.

Typical slowdown introduced by MemorySanitizer is **3x**.

How to build
============

Follow the `clang build instructions <../get_started.html>`_. CMake
build is supported.

Usage
=====

Simply compile and link your program with the ``-fsanitize=memory``
flag. The MemorySanitizer run-time library should be linked to the
final executable, so make sure to use ``clang`` (not ``ld``) for the
final link step. When linking shared libraries, the MemorySanitizer
run-time is not linked, so ``-Wl,-z,defs`` may cause link errors (don't
use it with MemorySanitizer). To get reasonable performance, add
``-O1`` or higher. To get meaningful stack traces in error messages,
add ``-fno-omit-frame-pointer``. To get perfect stack traces you may
need to disable inlining (just use ``-O1``) and tail call elimination
(``-fno-optimize-sibling-calls``).

.. code-block:: console

    % cat umr.cc
    #include <stdio.h>

    int main(int argc, char** argv) {
      int* a = new int[10];
      a[5] = 0;
      if (a[argc])
        printf("xx\n");
      return 0;
    }

    % clang -fsanitize=memory -fPIE -pie -fno-omit-frame-pointer -g -O2 umr.cc

If a bug is detected, the program prints an error message to stderr
and exits with a non-zero exit code. Currently, MemorySanitizer does
not symbolize its output by default, so you may need to use a separate
script to symbolize the result offline (this will be fixed in the
future).

.. code-block:: console

    % ./a.out 2>log
    % projects/compiler-rt/lib/asan/scripts/asan_symbolize.py / < log | c++filt
    ==30106==  WARNING: MemorySanitizer: UMR (uninitialized-memory-read)
        #0 0x7f45944b418a in main umr.cc:6
        #1 0x7f45938b676c in __libc_start_main libc-start.c:226
    Exiting

By default, MemorySanitizer exits on the first detected error.
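
One way to make the report for ``umr.cc`` go away is to give every
array element a defined value before it is read. The following is a
minimal sketch of that idea, not the only possible fix; the function
name ``read_element`` is hypothetical and does not appear in the
original example.

.. code-block:: c++

    #include <cstddef>
    #include <memory>

    // Hypothetical helper: returns element `index` of a freshly allocated
    // array, or -1 if the index is out of range.
    int read_element(std::size_t index) {
      const std::size_t kSize = 10;
      // The trailing () value-initializes all ten elements to zero,
      // unlike plain `new int[10]`, which leaves them uninitialized.
      std::unique_ptr<int[]> a(new int[kSize]());
      a[5] = 0;
      return index < kSize ? a[index] : -1;
    }

Calling ``read_element(1)`` reads a zero-initialized element, so a
branch on the result no longer depends on uninitialized memory.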

``__has_feature(memory_sanitizer)``
------------------------------------

In some cases one may need to execute different code depending on
whether MemorySanitizer is enabled. :ref:`\_\_has\_feature
<langext-__has_feature-__has_extension>` can be used for this purpose.

.. code-block:: c

    #if defined(__has_feature)
    #  if __has_feature(memory_sanitizer)
    // code that builds only under MemorySanitizer
    #  endif
    #endif
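
As an illustration, the check can be folded into a compile-time
constant that the rest of the program tests. This is only a sketch:
the names ``kMemorySanitizerEnabled`` and ``scaled_timeout_ms`` are
hypothetical, the fallback definition of ``__has_feature`` is a common
idiom rather than something this document prescribes, and C++11 is
assumed for ``constexpr``.

.. code-block:: c++

    // Fall back to "feature absent" on compilers without __has_feature.
    #ifndef __has_feature
    #  define __has_feature(x) 0
    #endif

    #if __has_feature(memory_sanitizer)
    constexpr bool kMemorySanitizerEnabled = true;
    #else
    constexpr bool kMemorySanitizerEnabled = false;
    #endif

    // Hypothetical helper: stretch a timeout by 3x in instrumented builds
    // to absorb the typical MemorySanitizer slowdown.
    constexpr int scaled_timeout_ms(int base_ms) {
      return kMemorySanitizerEnabled ? base_ms * 3 : base_ms;
    }

A test harness could then call ``scaled_timeout_ms(1000)`` and get the
base value in a normal build and three times that value under
MemorySanitizer.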

Origin Tracking
===============

MemorySanitizer can track origins of uninitialized values, similar to
Valgrind's ``--track-origins`` option. This feature is enabled by the
``-fsanitize-memory-track-origins`` Clang option. With the code from
the example above,

.. code-block:: console

    % clang -fsanitize=memory -fsanitize-memory-track-origins -fPIE -pie -fno-omit-frame-pointer -g -O2 umr.cc
    % ./a.out 2>log
    % projects/compiler-rt/lib/asan/scripts/asan_symbolize.py / < log | c++filt
    ==14425==  WARNING: MemorySanitizer: UMR (uninitialized-memory-read)
    ==14425== WARNING: Trying to symbolize code, but external symbolizer is not initialized!
        #0 0x7f8bdda3824b in main umr.cc:6
        #1 0x7f8bdce3a76c in __libc_start_main libc-start.c:226
      raw origin id: 2030043137
      ORIGIN: heap allocation:
        #0 0x7f8bdda4034b in operator new[](unsigned long) msan_new_delete.cc:39
        #1 0x7f8bdda3814d in main umr.cc:4
        #2 0x7f8bdce3a76c in __libc_start_main libc-start.c:226
    Exiting

Origin tracking has proved to be very useful for debugging UMR
reports. It slows down program execution by a factor of 1.5x-2x on top
of the usual MemorySanitizer slowdown.
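
Origin tracking helps most when the allocation and the use are far
apart. The sketch below is an illustrative assumption, not code from
MemorySanitizer; the names ``Options``, ``make_options``,
``copy_verbose`` and ``is_verbose`` are hypothetical. If
``make_options(false)`` were used, the branch in ``is_verbose`` would
be the expected report location, while the origin stack would be
expected to lead back to the ``new`` expression.

.. code-block:: c++

    struct Options {
      int verbose;
      int level;
    };

    // Allocation site: `new Options` leaves both fields uninitialized.
    // `verbose` receives a value only when the caller asks for it.
    Options* make_options(bool set_verbose) {
      Options* o = new Options;
      o->level = 1;
      if (set_verbose) o->verbose = 0;
      return o;
    }

    // Copying the value moves it around without deciding anything on it.
    int copy_verbose(const Options* o) { return o->verbose; }

    // Use site: the branch on the value is where a report would appear.
    bool is_verbose(int flag) { return flag != 0; }

Without origin tracking, the report would name only ``is_verbose``;
the allocation in ``make_options`` would have to be found by hand.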

Handling external code
======================

MemorySanitizer requires that all program code is instrumented. This
also includes any libraries that the program depends on, even libc.
Failing to achieve this may result in false UMR reports.

Full MemorySanitizer instrumentation is very difficult to achieve. To
make it easier, the MemorySanitizer runtime library includes 70+
interceptors for the most common libc functions. They make it possible
to run MemorySanitizer-instrumented programs linked with an
uninstrumented libc. For example, the authors were able to bootstrap a
MemorySanitizer-instrumented Clang compiler by linking it with a
self-built instrumented libcxx (as a replacement for libstdc++).

When rebuilding all program dependencies with MemorySanitizer is
problematic, an experimental MSanDR tool can be used. It is a
DynamoRio-based tool that uses dynamic instrumentation to avoid false
positives due to uninstrumented code. The tool simply marks memory
written by uninstrumented libraries as fully initialized. See
http://code.google.com/p/memory-sanitizer/wiki/Running#Running_with_the_dynamic_tool
for more information.
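
A manual alternative for a small number of call sites is to tell the
runtime that a buffer filled by uninstrumented code is initialized.
The sketch below assumes that the toolchain ships
``<sanitizer/msan_interface.h>`` with ``__msan_unpoison``; this
document does not describe that interface, so treat its availability
as an assumption. ``mark_initialized``, ``legacy_fill`` and
``fill_buffer`` are hypothetical names, and ``legacy_fill`` merely
stands in for a function from an uninstrumented library.

.. code-block:: c++

    #include <cstddef>
    #include <cstring>

    // Fall back to "feature absent" on compilers without __has_feature.
    #ifndef __has_feature
    #  define __has_feature(x) 0
    #endif

    #if __has_feature(memory_sanitizer)
    #  include <sanitizer/msan_interface.h>
    #endif

    // Marks [p, p + size) as initialized; a no-op in uninstrumented builds.
    inline void mark_initialized(const void* p, std::size_t size) {
    #if __has_feature(memory_sanitizer)
      __msan_unpoison(p, size);
    #else
      (void)p;
      (void)size;
    #endif
    }

    // Stand-in for a function that lives in an uninstrumented library.
    void legacy_fill(char* buf, std::size_t size) {
      std::memset(buf, 'x', size);
    }

    // Wrapper: call the external function, then record that the buffer
    // now holds initialized data.
    void fill_buffer(char* buf, std::size_t size) {
      legacy_fill(buf, size);
      mark_initialized(buf, size);
    }

Keeping the annotation inside one wrapper means callers use
``fill_buffer`` as usual, and the same code builds with or without
MemorySanitizer.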

Supported Platforms
===================

MemorySanitizer is supported on:

* Linux x86\_64 (tested on Ubuntu 10.04 and 12.04).

Limitations
===========

* MemorySanitizer uses 2x more real memory than a native run, 3x with
  origin tracking.
* MemorySanitizer maps (but does not reserve) 64 Terabytes of virtual
  address space. This means that tools like ``ulimit`` may not work as
  expected.
* Static linking is not supported.
* Non-position-independent executables are not supported.
* Depending on the version of the Linux kernel, running without ASLR
  may not be supported. Note that GDB disables ASLR by default. To
  debug instrumented programs, use ``set disable-randomization off``.

Current Status
==============

MemorySanitizer is an experimental tool. It is known to work on large
real-world programs, like Clang/LLVM itself.

More Information
================

`http://code.google.com/p/memory-sanitizer <http://code.google.com/p/memory-sanitizer/>`_