<?xml version="1.0"?> <!-- -*- sgml -*- -->
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
  "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd"
[ <!ENTITY % vg-entities SYSTEM "../../docs/xml/vg-entities.xml"> %vg-entities; ]>


<chapter id="sg-manual"
         xreflabel="SGCheck: an experimental stack and global array overrun detector">
  <title>SGCheck: an experimental stack and global array overrun detector</title>

<para>To use this tool, you must specify
<option>--tool=exp-sgcheck</option> on the Valgrind
command line.</para>




<sect1 id="sg-manual.overview" xreflabel="Overview">
<title>Overview</title>

<para>SGCheck is a tool for finding overruns of stack and global
arrays.  It uses a heuristic derived from an observation about the
likely forms of stack and global array accesses.
</para>

</sect1>




<sect1 id="sg-manual.options" xreflabel="SGCheck Command-line Options">
<title>SGCheck Command-line Options</title>

<para id="sg.opts.list">There are no SGCheck-specific command-line options at present.</para>
<!--
<para>SGCheck-specific command-line options are:</para>


<variablelist id="sg.opts.list">
</variablelist>
-->

</sect1>



<sect1 id="sg-manual.how-works.sg-checks"
       xreflabel="How SGCheck Works">
<title>How SGCheck Works</title>

<para>When a source file is compiled
with <option>-g</option>, the compiler attaches DWARF3
debugging information that describes the location of all stack and
global arrays in the file.</para>

<para>Checking of accesses to such arrays would then be relatively
simple, if the compiler could also tell us which array (if any) each
memory-referencing instruction was supposed to access.  Unfortunately
the DWARF3 debugging format does not provide a way to represent such
information, so we have to resort to a heuristic technique to
approximate it.  The key observation is that
   <emphasis>
   if a memory-referencing instruction accesses inside a stack or
   global array once, then it is highly likely to always access that
   same array</emphasis>.</para>

<para>To see how this might be useful, consider the following buggy
fragment:</para>
<programlisting><![CDATA[
   { int i, a[10];  // both are auto vars
     for (i = 0; i <= 10; i++)
        a[i] = 42;
   }
]]></programlisting>

<para>At run time we will know the precise address
of <computeroutput>a[]</computeroutput> on the stack, and so we can
observe that the first store resulting from <computeroutput>a[i] =
42</computeroutput> writes <computeroutput>a[]</computeroutput>, and
we will (correctly) assume that the instruction is always intended to
access <computeroutput>a[]</computeroutput>.  Then, on the 11th
iteration, it accesses somewhere else, possibly a different local,
possibly an unaccounted-for area of the stack (e.g. a spill slot), so
SGCheck reports an error.</para>
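<para>The heuristic can be illustrated with a small model.  The sketch
below is not SGCheck's implementation: the types
<computeroutput>ArrayBounds</computeroutput> and
<computeroutput>InstrRecord</computeroutput> and the function
<computeroutput>check_access</computeroutput> are hypothetical names
invented for this example, and it assumes, for simplicity, that an
instruction stays unassociated until it first touches a known
array.</para>
<programlisting><![CDATA[
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model types; SGCheck's real data structures differ. */
typedef struct { uintptr_t base; size_t size; } ArrayBounds; /* from debug info */
typedef struct { const ArrayBounds *target; } InstrRecord;   /* one per instruction */

/* Does [addr, addr+len) lie wholly inside the array? */
static int inside(const ArrayBounds *a, uintptr_t addr, size_t len)
{
   return addr >= a->base && len <= a->size
          && addr - a->base <= a->size - len;
}

/* Returns 1 if this access should be reported, 0 otherwise. */
int check_access(InstrRecord *rec, const ArrayBounds *known, size_t n_known,
                 uintptr_t addr, size_t len)
{
   assert(rec != NULL);
   if (rec->target == NULL) {
      /* No association yet: adopt whichever known array contains the
         access (if any) as the likely target.  Nothing is reported. */
      for (size_t i = 0; i < n_known; i++) {
         if (inside(&known[i], addr, len)) {
            rec->target = &known[i];
            break;
         }
      }
      return 0;
   }
   /* Later accesses must stay inside the associated array. */
   return !inside(rec->target, addr, len);
}
]]></programlisting>
<para>If <computeroutput>a[]</computeroutput> is modelled as a 40-byte
array and the store is replayed with 4-byte accesses, the ten in-bounds
stores each return 0 and the store for <computeroutput>i ==
10</computeroutput> returns 1.</para>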

<para>There is an important caveat.</para>

<para>Imagine a function such as <function>memcpy</function>, which is used
to read and write many different areas of memory over the lifetime of the
program.  If we insist that the read and write instructions in its memory
copying loop only ever access one particular stack or global variable, we
will be flooded with errors resulting from calls to
<function>memcpy</function>.</para>

<para>To avoid this problem, SGCheck instantiates fresh likely-target
records for each entry to a function, and discards them on exit.  This
allows detection of cases where (e.g.) <function>memcpy</function>
overflows its source or destination buffers for any specific call, but
does not carry any restriction from one call to the next.  Indeed,
multiple threads may make multiple simultaneous calls to
(e.g.) <function>memcpy</function> without mutual interference.</para>
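<para>The per-call lifetime of these records can be sketched as
follows.  This is an illustrative model only:
<computeroutput>Frame</computeroutput>,
<computeroutput>frame_enter</computeroutput>,
<computeroutput>frame_exit</computeroutput> and
<computeroutput>frame_access</computeroutput> are hypothetical names,
the fixed <computeroutput>MAX_INSTRS</computeroutput> limit is an
assumption made to keep the example short, and accesses are modelled as
single bytes.</para>
<programlisting><![CDATA[
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_INSTRS 4  /* hypothetical: instructions tracked per function */

typedef struct { uintptr_t base; size_t size; } Bounds;

/* One activation of a function.  Each memory-referencing instruction in
   the function gets its own likely-target slot, private to this call. */
typedef struct { const Bounds *target[MAX_INSTRS]; } Frame;

static void frame_reset(Frame *f)
{
   for (size_t i = 0; i < MAX_INSTRS; i++)
      f->target[i] = NULL;
}

void frame_enter(Frame *f) { frame_reset(f); }  /* fresh records on entry */
void frame_exit(Frame *f)  { frame_reset(f); }  /* discarded on exit */

/* Which known array (assumed non-overlapping) contains addr, if any? */
static const Bounds *find(const Bounds *known, size_t n, uintptr_t addr)
{
   for (size_t i = 0; i < n; i++)
      if (addr >= known[i].base && addr - known[i].base < known[i].size)
         return &known[i];
   return NULL;
}

/* Returns 1 if instruction `instr` strays from the array it first
   touched during this activation, 0 otherwise. */
int frame_access(Frame *f, size_t instr, const Bounds *known, size_t n,
                 uintptr_t addr)
{
   assert(instr < MAX_INSTRS);
   const Bounds *hit = find(known, n, addr);
   if (f->target[instr] == NULL) {
      f->target[instr] = hit;   /* may stay NULL if no array matched */
      return 0;
   }
   return hit != f->target[instr];
}
]]></programlisting>
<para>In this model, a copying loop that writes one buffer during its
first activation and a different buffer during its second is never
reported, because the association is dropped between the two calls.  An
access that leaves the associated buffer within a single activation is
still reported.  Giving each thread its own set of activations would
keep simultaneous calls independent; that arrangement is an assumption
of this sketch rather than a description of SGCheck's internals.</para>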

</sect1>




<sect1 id="sg-manual.cmp-w-memcheck"
       xreflabel="Comparison with Memcheck">
<title>Comparison with Memcheck</title>

<para>SGCheck and Memcheck are complementary: their capabilities do
not overlap.  Memcheck performs bounds checks and use-after-free
checks for heap arrays.  It also finds uses of uninitialised values
created by heap or stack allocations.  But it does not perform bounds
checking for stack or global arrays.</para>

<para>SGCheck, on the other hand, does do bounds checking for stack or
global arrays, but it doesn't do anything else.</para>

</sect1>





<sect1 id="sg-manual.limitations"
       xreflabel="Limitations">
<title>Limitations</title>

<para>This is an experimental tool, which relies rather too heavily on some
not-as-robust-as-I-would-like assumptions about the behaviour of correct
programs.  There are a number of limitations which you should be aware
of.</para>

<itemizedlist>

  <listitem>
   <para>False negatives (missed errors): it follows from the
   description above (<xref linkend="sg-manual.how-works.sg-checks"/>)
   that the first access by a memory-referencing instruction to a
   stack or global array creates an association between that
   instruction and the array, which is checked on subsequent accesses
   by that instruction, until the containing function exits.  Hence,
   the first access by an instruction to an array (in any given
   function instantiation) is not checked for overrun, since SGCheck
   uses that as the "example" of how subsequent accesses should
   behave.</para>
  </listitem>

  <listitem>
   <para>False positives (false errors): similarly, and more seriously,
   it is clearly possible to write legitimate pieces of code which
   break the basic assumption upon which the checking algorithm
   depends.  For example:</para>

<programlisting><![CDATA[
  { int a[10], b[10], *p, i;
    for (i = 0; i < 10; i++) {
       p = /* arbitrary condition */  ? &a[i]  : &b[i];
       *p = 42;
    }
  }
]]></programlisting>

   <para>In this case the store sometimes
   accesses <computeroutput>a[]</computeroutput> and
   sometimes <computeroutput>b[]</computeroutput>, but in no case is
   the addressed array overrun.  Nevertheless the change in target
   will cause an error to be reported.</para>

   <para>It is hard to see how to get around this problem.  The only
   mitigating factor is that such constructions appear to be very rare,
   at least judging from results obtained with the tool so far.  Such a
   construction appears only once in the Valgrind sources (running
   Valgrind on Valgrind) and perhaps two or three times for a start
   and exit of Firefox.  The best that can be done is to suppress the
   errors.</para>
  </listitem>

  <listitem>
   <para>Performance: SGCheck has to read all of
   the DWARF3 type and variable information on the executable and its
   shared objects.  This is computationally expensive and makes
   startup quite slow.  You can expect debuginfo reading time to be in
   the region of a minute for an OpenOffice-sized application, on a
   2.4 GHz Core 2 machine.  Reading this information also requires a
   lot of memory.  To make it viable, SGCheck goes to considerable
   trouble to compress the in-memory representation of the DWARF3
   data, which is why the process of reading it appears slow.</para>
  </listitem>

  <listitem>
   <para>Performance: SGCheck runs slower than Memcheck.  This is
   partly due to a lack of tuning, but partly due to algorithmic
   difficulties.  The
   stack and global checks can sometimes require a number of range
   checks per memory access, and these are difficult to short-circuit,
   despite considerable efforts having been made.  A
   redesign and reimplementation could potentially make it much faster.
   </para>
  </listitem>

  <listitem>
   <para>Coverage: stack and global checking is fragile.  If a shared
   object does not have debug information attached, then SGCheck will
   not be able to determine the bounds of any stack or global arrays
   defined within that shared object, and so will not be able to check
   accesses to them.  This is true even when those arrays are accessed
   from some other shared object which was compiled with debug
   info.</para>

   <para>At the moment SGCheck accepts objects lacking debuginfo
   without comment.  This is dangerous as it causes SGCheck to
   silently skip stack and global checking for such objects.  It would
   be better to print a warning in such circumstances.</para>
  </listitem>

  <listitem>
   <para>Coverage: SGCheck does not check whether the areas read
   or written by system calls overrun stack or global arrays.  This
   would be easy to add.</para>
  </listitem>

  <listitem>
   <para>Platforms: the stack/global checks work properly only on
   X86 and AMD64 targets, not on PowerPC, ARM or S390X platforms.
   That's because the stack and global checking requires tracking
   function calls and exits reliably, and there's no obvious way to do
   it on ABIs that use a link register for function returns.
   </para>
  </listitem>

  <listitem>
   <para>Robustness: related to the previous point.  Function
   call/exit tracking for X86 and AMD64 is believed to work properly
   even in the presence of longjmps within the same stack (although
   this has not been tested).  However, code which switches stacks is
   likely to cause breakage/chaos.</para>
  </listitem>
</itemizedlist>
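<para>The false-negative limitation in the first item above can be made
concrete with a small replay model.  The sketch is hypothetical and does
not reflect SGCheck's internals: <computeroutput>Region</computeroutput>
and <computeroutput>first_reported</computeroutput> are names invented
for this example, accesses are modelled as single addresses, and the two
arrays are assumed to be adjacent in memory.</para>
<programlisting><![CDATA[
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uintptr_t base; size_t size; } Region;  /* hypothetical */

/* Index of the region containing addr, or -1 if none does. */
static int region_of(const Region *r, int n, uintptr_t addr)
{
   for (int i = 0; i < n; i++)
      if (addr >= r[i].base && addr - r[i].base < r[i].size)
         return i;
   return -1;
}

/* Replays the addresses touched by ONE instruction during one function
   activation.  Returns the index of the first access that would be
   reported, or -1 if nothing would be reported. */
int first_reported(const Region *r, int n,
                   const uintptr_t *addrs, int n_addrs)
{
   assert(r != NULL && addrs != NULL);
   int bound = -1;                  /* no association yet */
   for (int k = 0; k < n_addrs; k++) {
      int hit = region_of(r, n, addrs[k]);
      if (bound < 0) {
         bound = hit;               /* first hit is trusted, never checked */
         continue;
      }
      if (hit != bound)
         return k;
   }
   return -1;
}
]]></programlisting>
<para>Suppose <computeroutput>a[]</computeroutput> occupies 40 bytes and
<computeroutput>b[]</computeroutput> the 40 bytes immediately after it.
An instruction whose only access is a store one element past the end of
<computeroutput>a[]</computeroutput> is then simply associated with
<computeroutput>b[]</computeroutput>, and nothing is reported.  If the
same instruction first makes in-bounds accesses to
<computeroutput>a[]</computeroutput>, the later stray access is
reported.</para>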

</sect1>





<sect1 id="sg-manual.todo-user-visible"
       xreflabel="Still To Do: User-visible Functionality">
<title>Still To Do: User-visible Functionality</title>

<itemizedlist>

  <listitem>
   <para>Extend system call checking to work on stack and global arrays.</para>
  </listitem>

  <listitem>
   <para>Print a warning if a shared object does not have debug info
   attached, or if, for whatever reason, debug info could not be
   found, or read.</para>
  </listitem>

  <listitem>
   <para>Add some heuristic filtering that removes obvious false
   positives.  This would be easy to do.  For example, an access
   transition from a heap to a stack object almost certainly isn't a
   bug and so should not be reported to the user.</para>
  </listitem>

</itemizedlist>

</sect1>




<sect1 id="sg-manual.todo-implementation"
       xreflabel="Still To Do: Implementation Tidying">
<title>Still To Do: Implementation Tidying</title>

<para>Items marked CRITICAL are considered important for correctness:
failure to fix them is liable to lead to crashes or assertion failures
in real use.</para>

<itemizedlist>

  <listitem>
   <para>sg_main.c: Redesign and reimplement the basic checking
   algorithm.  It could be done much faster than it is; the current
   implementation isn't very good.
   </para>
  </listitem>

  <listitem>
   <para>sg_main.c: Improve the performance of the stack / global
   checks by doing some up-front filtering to ignore references in
   areas which "obviously" can't be stack or globals.  This will
   require using information that m_aspacemgr knows about the address
   space layout.</para>
  </listitem>

  <listitem>
   <para>sg_main.c: Fix compute_II_hash to make it a bit more sensible
   for ppc32/64 targets (except that SGCheck doesn't work on ppc32/64
   targets, so this is a bit academic at the moment).</para>
  </listitem>

</itemizedlist>

</sect1>



</chapter>