| <?xml version="1.0"?> <!-- -*- sgml -*- --> |
| <!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN" |
| "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd" |
| [ <!ENTITY % vg-entities SYSTEM "../../docs/xml/vg-entities.xml"> %vg-entities; ]> |
| |
| |
| <chapter id="sg-manual" |
| xreflabel="SGCheck: an experimental stack and global array overrun detector"> |
| <title>SGCheck: an experimental stack and global array overrun detector</title> |
| |
| <para>To use this tool, you must specify |
| <option>--tool=exp-sgcheck</option> on the Valgrind |
| command line.</para> |
| |
| |
| |
| |
| <sect1 id="sg-manual.overview" xreflabel="Overview"> |
| <title>Overview</title> |
| |
| <para>SGCheck is a tool for finding overruns of stack and global |
| arrays. It works by using a heuristic approach derived from an |
| observation about the likely forms of stack and global array accesses. |
| </para> |
| |
| </sect1> |
| |
| |
| |
| |
| <sect1 id="sg-manual.options" xreflabel="SGCheck Command-line Options"> |
| <title>SGCheck Command-line Options</title> |
| |
| <para id="sg.opts.list">There are no SGCheck-specific command-line options at present.</para> |
| <!-- |
| <para>SGCheck-specific command-line options are:</para> |
| |
| |
| <variablelist id="sg.opts.list"> |
| </variablelist> |
| --> |
| |
| </sect1> |
| |
| |
| |
| <sect1 id="sg-manual.how-works.sg-checks" |
| xreflabel="How SGCheck Works"> |
| <title>How SGCheck Works</title> |
| |
| <para>When a source file is compiled |
| with <option>-g</option>, the compiler attaches DWARF3 |
| debugging information which describes the location of all stack and |
| global arrays in the file.</para> |
| |
| <para>Checking of accesses to such arrays would then be relatively |
| simple, if the compiler could also tell us which array (if any) each |
| memory referencing instruction was supposed to access. Unfortunately |
| the DWARF3 debugging format does not provide a way to represent such |
| information, so we have to resort to a heuristic technique to |
| approximate it. The key observation is that |
| <emphasis> |
| if a memory referencing instruction accesses inside a stack or |
| global array once, then it is highly likely to always access that |
| same array</emphasis>.</para> |
| |
| <para>To see how this might be useful, consider the following buggy |
| fragment:</para> |
| <programlisting><![CDATA[ |
| { int i, a[10]; // both are auto vars |
| for (i = 0; i <= 10; i++) |
| a[i] = 42; |
| } |
| ]]></programlisting> |
| |
| <para>At run time we will know the precise address |
| of <computeroutput>a[]</computeroutput> on the stack, and so we can |
| observe that the first store resulting from <computeroutput>a[i] = |
| 42</computeroutput> writes <computeroutput>a[]</computeroutput>, and |
| we will (correctly) assume that that instruction is intended always to |
| access <computeroutput>a[]</computeroutput>. Then, on the 11th |
| iteration, it accesses somewhere else, possibly a different local, |
| possibly an un-accounted for area of the stack (eg, spill slot), so |
| SGCheck reports an error.</para> |
| |
| <para>There is an important caveat.</para> |
| |
| <para>Imagine a function such as <function>memcpy</function>, which is used |
| to read and write many different areas of memory over the lifetime of the |
| program. If we insist that the read and write instructions in its memory |
| copying loop only ever access one particular stack or global variable, we |
| will be flooded with errors resulting from calls to |
| <function>memcpy</function>.</para> |
| |
| <para>To avoid this problem, SGCheck instantiates fresh likely-target |
| records for each entry to a function, and discards them on exit. This |
| allows detection of cases where (e.g.) <function>memcpy</function> |
| overflows its source or destination buffers for any specific call, but |
| does not carry any restriction from one call to the next. Indeed, |
| multiple threads may make multiple simultaneous calls to |
| (e.g.) <function>memcpy</function> without mutual interference.</para> |
| |
| </sect1> |
| |
| |
| |
| |
| <sect1 id="sg-manual.cmp-w-memcheck" |
| xreflabel="Comparison with Memcheck"> |
| <title>Comparison with Memcheck</title> |
| |
| <para>SGCheck and Memcheck are complementary: their capabilities do |
| not overlap. Memcheck performs bounds checks and use-after-free |
| checks for heap arrays. It also finds uses of uninitialised values |
| created by heap or stack allocations. But it does not perform bounds |
| checking for stack or global arrays.</para> |
| |
| <para>SGCheck, on the other hand, does do bounds checking for stack or |
| global arrays, but it doesn't do anything else.</para> |
| |
| </sect1> |
| |
| |
| |
| |
| |
| <sect1 id="sg-manual.limitations" |
| xreflabel="Limitations"> |
| <title>Limitations</title> |
| |
| <para>This is an experimental tool, which relies rather too heavily on some |
| not-as-robust-as-I-would-like assumptions on the behaviour of correct |
| programs. There are a number of limitations which you should be aware |
| of.</para> |
| |
| <itemizedlist> |
| |
| <listitem> |
| <para>False negatives (missed errors): it follows from the |
| description above (<xref linkend="sg-manual.how-works.sg-checks"/>) |
| that the first access by a memory referencing instruction to a |
| stack or global array creates an association between that |
| instruction and the array, which is checked on subsequent accesses |
| by that instruction, until the containing function exits. Hence, |
| the first access by an instruction to an array (in any given |
| function instantiation) is not checked for overrun, since SGCheck |
| uses that as the "example" of how subsequent accesses should |
| behave.</para> |
| </listitem> |
| |
| <listitem> |
| <para>False positives (false errors): similarly, and more serious, |
| it is clearly possible to write legitimate pieces of code which |
| break the basic assumption upon which the checking algorithm |
| depends. For example:</para> |
| |
| <programlisting><![CDATA[ |
| { int a[10], b[10], *p, i; |
| for (i = 0; i < 10; i++) { |
| p = /* arbitrary condition */ ? &a[i] : &b[i]; |
| *p = 42; |
| } |
| } |
| ]]></programlisting> |
| |
| <para>In this case the store sometimes |
| accesses <computeroutput>a[]</computeroutput> and |
| sometimes <computeroutput>b[]</computeroutput>, but in no cases is |
| the addressed array overrun. Nevertheless the change in target |
| will cause an error to be reported.</para> |
| |
| <para>It is hard to see how to get around this problem. The only |
| mitigating factor is that such constructions appear very rare, at |
| least judging from the results using the tool so far. Such a |
| construction appears only once in the Valgrind sources (running |
| Valgrind on Valgrind) and perhaps two or three times for a start |
| and exit of Firefox. The best that can be done is to suppress the |
| errors.</para> |
| </listitem> |
| |
| <listitem> |
| <para>Performance: SGCheck has to read all of |
| the DWARF3 type and variable information on the executable and its |
| shared objects. This is computationally expensive and makes |
| startup quite slow. You can expect debuginfo reading time to be in |
| the region of a minute for an OpenOffice sized application, on a |
| 2.4 GHz Core 2 machine. Reading this information also requires a |
| lot of memory. To make it viable, SGCheck goes to considerable |
| trouble to compress the in-memory representation of the DWARF3 |
| data, which is why the process of reading it appears slow.</para> |
| </listitem> |
| |
| <listitem> |
| <para>Performance: SGCheck runs slower than Memcheck. This is |
| partly due to a lack of tuning, but partly due to algorithmic |
| difficulties. The |
| stack and global checks can sometimes require a number of range |
| checks per memory access, and these are difficult to short-circuit, |
| despite considerable efforts having been made. A |
| redesign and reimplementation could potentially make it much faster. |
| </para> |
| </listitem> |
| |
| <listitem> |
| <para>Coverage: Stack and global checking is fragile. If a shared |
| object does not have debug information attached, then SGCheck will |
| not be able to determine the bounds of any stack or global arrays |
| defined within that shared object, and so will not be able to check |
| accesses to them. This is true even when those arrays are accessed |
| from some other shared object which was compiled with debug |
| info.</para> |
| |
| <para>At the moment SGCheck accepts objects lacking debuginfo |
| without comment. This is dangerous as it causes SGCheck to |
| silently skip stack and global checking for such objects. It would |
| be better to print a warning in such circumstances.</para> |
| </listitem> |
| |
| <listitem> |
| <para>Coverage: SGCheck does not check whether the areas read |
| or written by system calls do overrun stack or global arrays. This |
| would be easy to add.</para> |
| </listitem> |
| |
| <listitem> |
| <para>Platforms: the stack/global checks won't work properly on |
| PowerPC, ARM or S390X platforms, only on X86 and AMD64 targets. |
| That's because the stack and global checking requires tracking |
| function calls and exits reliably, and there's no obvious way to do |
| it on ABIs that use a link register for function returns. |
| </para> |
| </listitem> |
| |
| <listitem> |
| <para>Robustness: related to the previous point. Function |
| call/exit tracking for X86 and AMD64 is believed to work properly |
| even in the presence of longjmps within the same stack (although |
| this has not been tested). However, code which switches stacks is |
| likely to cause breakage/chaos.</para> |
| </listitem> |
| </itemizedlist> |
| |
| </sect1> |
| |
| |
| |
| |
| |
| <sect1 id="sg-manual.todo-user-visible" |
| xreflabel="Still To Do: User-visible Functionality"> |
| <title>Still To Do: User-visible Functionality</title> |
| |
| <itemizedlist> |
| |
| <listitem> |
| <para>Extend system call checking to work on stack and global arrays.</para> |
| </listitem> |
| |
| <listitem> |
| <para>Print a warning if a shared object does not have debug info |
| attached, or if, for whatever reason, debug info could not be |
| found, or read.</para> |
| </listitem> |
| |
| <listitem> |
| <para>Add some heuristic filtering that removes obvious false |
| positives. This would be easy to do. For example, an access |
| transition from a heap to a stack object almost certainly isn't a |
| bug and so should not be reported to the user.</para> |
| </listitem> |
| |
| </itemizedlist> |
| |
| </sect1> |
| |
| |
| |
| |
| <sect1 id="sg-manual.todo-implementation" |
| xreflabel="Still To Do: Implementation Tidying"> |
| <title>Still To Do: Implementation Tidying</title> |
| |
| <para>Items marked CRITICAL are considered important for correctness: |
| non-fixage of them is liable to lead to crashes or assertion failures |
| in real use.</para> |
| |
| <itemizedlist> |
| |
| <listitem> |
| <para> sg_main.c: Redesign and reimplement the basic checking |
| algorithm. It could be done much faster than it is -- the current |
| implementation isn't very good. |
| </para> |
| </listitem> |
| |
| <listitem> |
| <para> sg_main.c: Improve the performance of the stack / global |
| checks by doing some up-front filtering to ignore references in |
| areas which "obviously" can't be stack or globals. This will |
| require using information that m_aspacemgr knows about the address |
| space layout.</para> |
| </listitem> |
| |
| <listitem> |
| <para>sg_main.c: fix compute_II_hash to make it a bit more sensible |
| for ppc32/64 targets (except that sg_ doesn't work on ppc32/64 |
| targets, so this is a bit academic at the moment).</para> |
| </listitem> |
| |
| </itemizedlist> |
| |
| </sect1> |
| |
| |
| |
| </chapter> |