<?xml version="1.0"?> <!-- -*- sgml -*- -->
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
"http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd"
[ <!ENTITY % vg-entities SYSTEM "../../docs/xml/vg-entities.xml"> %vg-entities; ]>
<chapter id="pc-manual"
xreflabel="Ptrcheck: an experimental heap, stack &amp; global array overrun detector">
<title>Ptrcheck: an experimental heap, stack &amp; global array overrun detector</title>
<para>To use this tool, you must specify
<computeroutput>--tool=exp-ptrcheck</computeroutput> on the Valgrind
command line.</para>
<sect1 id="pc-manual.overview" xreflabel="Overview">
<title>Overview</title>
<para>Ptrcheck is a tool for finding overruns of heap, stack
and global arrays. Its functionality overlaps somewhat with
Memcheck's, but it is able to catch invalid accesses in a number of
cases that Memcheck would miss. A detailed comparison against
Memcheck is presented below.</para>
<para>Ptrcheck is composed of two almost completely independent tools
that have been glued together. One part,
in <computeroutput>h_main.[ch]</computeroutput>, checks accesses
through heap-derived pointers. The other part, in
<computeroutput>sg_main.[ch]</computeroutput>, checks accesses to
stack and global arrays. The remaining
files <computeroutput>pc_{common,main}.[ch]</computeroutput>, provide
common error-management and coordination functions, so as to make it
appear as a single tool.</para>
<para>The heap-check part is an extensively-hacked (largely rewritten)
version of the experimental "Annelid" tool developed and described by
Nicholas Nethercote and Jeremy Fitzhardinge. The stack- and
global-check part uses a heuristic approach derived from an observation about
the likely forms of stack and global array accesses, and, as far as is
known, is entirely novel.</para>
</sect1>
<sect1 id="pc-manual.options" xreflabel="Ptrcheck Options">
<title>Ptrcheck Options</title>
<para>The following end-user options are available:</para>
<!-- start of xi:include in the manpage -->
<variablelist id="pc.opts.list">
<varlistentry id="opt.enable-sg-checks" xreflabel="--enable-sg-checks">
<term>
<option><![CDATA[--enable-sg-checks=no|yes
[default: yes] ]]></option>
</term>
<listitem>
<para>By default, Ptrcheck checks for overruns of stack, global
and heap arrays.
With <varname>--enable-sg-checks=no</varname>, the stack and
global array checks are omitted, and only heap checking is
performed. This can be useful because the stack and global
checks are quite expensive, so omitting them speeds Ptrcheck up
a lot.
</para>
</listitem>
</varlistentry>
<varlistentry id="opt.partial-loads-ok" xreflabel="--partial-loads-ok">
<term>
<option><![CDATA[--partial-loads-ok=<yes|no> [default: no] ]]></option>
</term>
<listitem>
<para>This option has the same meaning as it does for
Memcheck.</para>
<para>Controls how Ptrcheck handles word-sized, word-aligned
loads which partially overlap the end of heap blocks -- that is,
some of the bytes in the word are validly addressable, but
others are not. When <varname>yes</varname>, such loads do not
produce an address error. When <varname>no</varname> (the
default), loads from partially invalid addresses are treated the
same as loads from completely invalid addresses: an illegal heap
access error is issued.
</para>
<para>Note that code that behaves in this way is in violation of
the ISO C/C++ standards, and should be considered broken. If
at all possible, such code should be fixed. This flag should be
used only as a last resort.</para>
</listitem>
</varlistentry>
</variablelist>
<!-- end of xi:include in the manpage -->
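<para>The effect of <computeroutput>--partial-loads-ok</computeroutput>
can be pictured with a small model. The sketch below is not Ptrcheck's
implementation; the names <computeroutput>LoadKind</computeroutput>,
<computeroutput>classify_word_load</computeroutput> and
<computeroutput>reports_error</computeroutput> are hypothetical and
exist only for illustration. It classifies a load against a single heap
block and then applies the option as described above.</para>
<programlisting><![CDATA[
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model only; not taken from the Ptrcheck sources. */
typedef enum { LOAD_OK, LOAD_PARTIAL, LOAD_INVALID } LoadKind;

/* Classify a load of 'word' bytes at 'addr' against the heap block
   occupying [base, base+size). */
LoadKind classify_word_load(uintptr_t base, size_t size,
                            uintptr_t addr, size_t word)
{
   uintptr_t end = base + size;        /* one past the last valid byte */
   if (addr >= base && addr + word <= end)
      return LOAD_OK;                  /* every byte is addressable */
   if (addr >= base && addr < end && addr % word == 0)
      return LOAD_PARTIAL;             /* word-aligned, straddles the end */
   return LOAD_INVALID;
}

/* Returns 1 if the model would report an illegal heap access. */
int reports_error(LoadKind kind, int partial_loads_ok)
{
   if (kind == LOAD_OK) return 0;
   if (kind == LOAD_PARTIAL) return !partial_loads_ok;
   return 1;
}
]]></programlisting>
<para>In this model, an 8-byte load from offset 8 of a 10-byte block is
classified as <computeroutput>LOAD_PARTIAL</computeroutput>: it is
reported with the default setting and accepted
with <computeroutput>--partial-loads-ok=yes</computeroutput>.</para>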
<!-- start of xi:include in the manpage -->
<para>In addition, the following debugging options are available for
Ptrcheck:</para>
<variablelist id="hg.debugopts.list">
<varlistentry id="opt.trace-malloc" xreflabel="--trace-malloc">
<term>
<option><![CDATA[--trace-malloc=no|yes [no]
]]></option>
</term>
<listitem>
<para>Show all client malloc (etc) and free (etc) requests.</para>
</listitem>
</varlistentry>
</variablelist>
<!-- end of xi:include in the manpage -->
</sect1>
<sect1 id="pc-manual.how-works.heap-checks"
xreflabel="How Ptrcheck Works: Heap Checks">
<title>How Ptrcheck Works: Heap Checks</title>
<para>Ptrcheck can check for invalid uses of heap pointers, including
out of range accesses and accesses to freed memory. The mechanism is
however completely different from Memcheck's, and the checking is more
powerful.</para>
<para>For each pointer in the program, Ptrcheck keeps track of which
heap block (if any) it was derived from. Then, when an access is made
through that pointer, Ptrcheck compares the access address with the
bounds of the associated block, and reports an error if the address is
out of bounds, or if the block has been freed.</para>
<para>Of course it is rarely the case that one wants to access a block
only at the exact address returned by malloc (et al). Ptrcheck
understands that adding or subtracting offsets from a pointer to a
block results in a pointer to the same block.</para>
<para>At a fundamental level, this scheme works because a correct
program cannot make assumptions about the addresses returned by
malloc. In particular it cannot make any assumptions about the
differences in addresses returned by subsequent calls to malloc.
Hence there are very few ways to take an address returned by malloc,
modify it, and still have a valid address. In short, the only
allowable operations are adding and subtracting other non-pointer
values. Almost all other operations produce a value which cannot
possibly be a valid pointer.</para>
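<para>The scheme described above can be sketched as a small model. This
is an illustration under simplifying assumptions, not the code
in <computeroutput>h_main.c</computeroutput>: the
types <computeroutput>Block</computeroutput>
and <computeroutput>TrackedPtr</computeroutput> and the
functions <computeroutput>ptr_add</computeroutput>
and <computeroutput>access_is_error</computeroutput> are hypothetical
names. Each pointer value carries a reference to the block it was
derived from, offset arithmetic preserves that reference, and every
access is checked against the bounds and freed state of that
block.</para>
<programlisting><![CDATA[
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a tracked heap block. */
typedef struct {
   uintptr_t base;   /* address returned by the allocator */
   size_t    size;   /* requested size in bytes */
   int       freed;  /* nonzero once the block has been freed */
} Block;

/* A pointer value together with the block it was derived from
   (NULL if it is not known to derive from any heap block). */
typedef struct {
   uintptr_t    value;
   const Block *origin;
} TrackedPtr;

/* Adding or subtracting a non-pointer offset keeps the same origin. */
TrackedPtr ptr_add(TrackedPtr p, intptr_t offset)
{
   TrackedPtr r = { p.value + (uintptr_t)offset, p.origin };
   return r;
}

/* Returns 1 if an access of 'len' bytes through 'p' should be reported. */
int access_is_error(TrackedPtr p, size_t len)
{
   const Block *b = p.origin;
   if (b == NULL) return 0;                          /* not heap-derived */
   if (b->freed) return 1;                           /* block was freed */
   if (p.value < b->base) return 1;                  /* underrun */
   if (p.value - b->base + len > b->size) return 1;  /* overrun */
   return 0;
}
]]></programlisting>
<para>For a 40-byte block, a 4-byte access at offset 36 passes, while
accesses at offset 40, offset -4, or offset 4096 are all reported,
because the check is made against the originating block rather than
against whatever happens to live at the target address.</para>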
</sect1>
<sect1 id="pc-manual.how-works.sg-checks"
xreflabel="How Ptrcheck Works: Stack and Global Checks">
<title>How Ptrcheck Works: Stack and Global Checks</title>
<para>When a source file is compiled
with <computeroutput>-g</computeroutput>, the compiler attaches DWARF3
debugging information which describes the location of all stack and
global arrays in the file.</para>
<para>Checking of accesses to such arrays would then be relatively
simple, if the compiler could also tell us which array (if any) each
memory referencing instruction was supposed to access. Unfortunately
the DWARF3 debugging format does not provide a way to represent such
information, so we have to resort to a heuristic technique to
approximate the same information. The key observation is that
</para>
<para>
if a memory referencing instruction accesses inside a stack or
global array once, then it is highly likely to always access that
same array</para>
<para>To see how this might be useful, consider the following buggy
fragment:</para>
<programlisting><![CDATA[
{ int i, a[10]; // both are auto vars
for (i = 0; i <= 10; i++)
a[i] = 42;
}
]]></programlisting>
<para>At run time we will know the precise address
of <computeroutput>a[]</computeroutput> on the stack, and so we can
observe that the first store resulting from <computeroutput>a[i] =
42</computeroutput> writes <computeroutput>a[]</computeroutput>, and
we will (correctly) assume that that instruction is intended always to
access <computeroutput>a[]</computeroutput>. Then, on the 11th
iteration, it accesses somewhere else, possibly a different local,
possibly an un-accounted for area of the stack (eg, spill slot), so
Ptrcheck reports an error.</para>
<para>There is an important caveat.</para>
<para>Imagine a function such as memcpy, which is used to read and
write many different areas of memory over the lifetime of the program.
If we insist that the read and write instructions in its memory
copying loop only ever access one particular stack or global variable,
we will be flooded with errors resulting from calls to memcpy.</para>
<para>To avoid this problem, Ptrcheck instantiates fresh likely-target
records for each entry to a function, and discards them on exit. This
allows detection of cases where (eg) memcpy overflows its source or
destination buffers for any specific call, but does not carry any
restriction from one call to the next. Indeed, multiple threads may
make multiple simultaneous calls to (eg) memcpy without mutual
interference.</para>
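<para>The heuristic can be modelled in a few lines. The following is a
sketch under stated assumptions, not the code
in <computeroutput>sg_main.c</computeroutput>: the
names <computeroutput>Array</computeroutput>,
<computeroutput>InstrRecord</computeroutput>,
<computeroutput>record_reset</computeroutput>
and <computeroutput>check_access</computeroutput> are hypothetical, the
access size is ignored, and an instruction whose first access falls
outside every known array simply stays unassociated.</para>
<programlisting><![CDATA[
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model; names and layout are illustrative only. */
typedef struct { uintptr_t lo, hi; } Array;   /* occupies [lo, hi) */

/* Likely-target record for one memory-referencing instruction,
   valid for one activation of the containing function. */
typedef struct { const Array *target; } InstrRecord;

/* Called on function entry: start with no association. */
void record_reset(InstrRecord *r) { r->target = NULL; }

/* Find the known array containing 'addr', or NULL if there is none. */
const Array *find_array(const Array *arrs, size_t n, uintptr_t addr)
{
   for (size_t i = 0; i < n; i++)
      if (addr >= arrs[i].lo && addr < arrs[i].hi)
         return &arrs[i];
   return NULL;
}

/* Returns 1 if this access should be reported as an error. */
int check_access(InstrRecord *r, const Array *arrs, size_t n,
                 uintptr_t addr)
{
   const Array *hit = find_array(arrs, n, addr);
   if (r->target == NULL) {
      r->target = hit;       /* first hit becomes the "example" */
      return 0;
   }
   return hit != r->target;  /* later accesses must stay in that array */
}
]]></programlisting>
<para>Replaying the eleven stores of the buggy loop shown earlier
against a 40-byte array yields exactly one report, on the final
iteration. Calling <computeroutput>record_reset</computeroutput> models
a fresh function entry, after which the same instruction may associate
with a different array.</para>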
</sect1>
<sect1 id="pc-manual.cmp-w-memcheck"
xreflabel="Comparison with Memcheck">
<title>Comparison with Memcheck</title>
<para>Memcheck does not do any access checks for stack or global arrays, so
the presence of those in Ptrcheck is a straight win. (But see
"Limitations" below.)</para>
<para>Memcheck and Ptrcheck use different approaches for checking heap
accesses. Memcheck maintains bitmaps telling it which areas of memory
are accessible and which are not. If a memory access falls in an
inaccessible area, it reports an error. By marking the 16 bytes
before and after an allocated block inaccessible, Memcheck is able to
detect small over- and underruns of the block. Similarly, by marking
freed memory as inaccessible, Memcheck can detect all accesses to
freed memory.</para>
<para>Memcheck's approach is simple. But it's also weak. It can't
catch block overruns beyond 16 bytes. And, more generally, because it
focusses only on the question "is the target address accessible", it
fails to detect invalid accesses which just happen to fall within some
other valid area. This is not improbable, especially in crowded areas
of the process' address space.</para>
<para>Ptrcheck's approach is to keep track of pointers derived from
heap blocks. It tracks pointers which are derived directly from calls
to malloc et al, but also ones derived indirectly, by adding or
subtracting offsets from the directly-derived pointers. When a
pointer is finally used to access memory, Ptrcheck compares the access
address with that of the block it was originally derived from, and
reports an error if the access address is not within the block
bounds.</para>
<para>Consequently Ptrcheck can detect any out of bounds access
through a heap-derived pointer, no matter how far from the original
block it is.</para>
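<para>The difference between the two approaches can be illustrated with
a toy model. This is a sketch, not code from either tool; the
names <computeroutput>Block</computeroutput>,
<computeroutput>addr_is_accessible</computeroutput>
and <computeroutput>addr_in_origin</computeroutput> are hypothetical.
The first function asks only whether the target address lies in some
live block, and the second asks whether it lies in the block the pointer
was derived from.</para>
<programlisting><![CDATA[
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a live heap block occupying [base, base+size). */
typedef struct { uintptr_t base; size_t size; } Block;

/* Addressability-style check: is 'addr' inside any live block? */
int addr_is_accessible(const Block *blocks, size_t n, uintptr_t addr)
{
   for (size_t i = 0; i < n; i++)
      if (addr >= blocks[i].base && addr - blocks[i].base < blocks[i].size)
         return 1;
   return 0;
}

/* Provenance-style check: is 'addr' inside the originating block? */
int addr_in_origin(const Block *origin, uintptr_t addr)
{
   return addr >= origin->base && addr - origin->base < origin->size;
}
]]></programlisting>
<para>Take two 40-byte blocks at <computeroutput>0x1000</computeroutput>
and <computeroutput>0x1048</computeroutput>, separated by inaccessible
padding. An access 40 bytes past the start of the first block lands in
the padding and fails the addressability check. An access 80 bytes past
the start lands inside the second block, so the addressability check
accepts it, whereas the provenance check still rejects it.</para>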
<para>A second advantage is that Ptrcheck is better at detecting
accesses to blocks freed very far in the past. Memcheck can detect
these too, but only for blocks freed relatively recently. To detect
accesses to a freed block, Memcheck must make it inaccessible, hence
requiring a space overhead proportional to the size of the block. If
the blocks are large, Memcheck will have to make them available for
re-allocation relatively quickly, thereby losing the ability to detect
invalid accesses to them.</para>
<para>By contrast, Ptrcheck has a constant per-block space requirement
of four machine words, for detection of accesses to freed blocks. A
freed block can be reallocated immediately, yet Ptrcheck can still
detect all invalid accesses through any pointers derived from the old
allocation, providing only that the four-word descriptor for the old
allocation is stored. For example, on a 64-bit machine, to detect
accesses in any of the most recently freed 10 million blocks, Ptrcheck
will require only 320MB of extra storage. Achieving the same level of
detection with Memcheck is close to impossible and would likely
involve several gigabytes of extra storage.</para>
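<para>The 320MB figure follows directly from the four-word descriptor
size. As a worked check, the helper below (a hypothetical
name, <computeroutput>freed_descriptor_bytes</computeroutput>, used only
for this illustration) multiplies the number of remembered blocks by
four words and by the word size in bytes.</para>
<programlisting><![CDATA[
#include <stdint.h>

/* Bytes needed to keep one descriptor per freed block, assuming the
   four-machine-word descriptor size stated in the text. */
uint64_t freed_descriptor_bytes(uint64_t n_blocks, uint64_t word_bytes)
{
   const uint64_t words_per_descriptor = 4;
   return n_blocks * words_per_descriptor * word_bytes;
}
]]></programlisting>
<para>With 10 million blocks and 8-byte words this gives 320,000,000
bytes. The same calculation with 4-byte words gives 160,000,000
bytes.</para>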
<para>In defense of Memcheck ...</para>
<para>Remember that Memcheck performs uninitialised value checking,
which Ptrcheck does not. Memcheck has also benefitted from years of
refinement, tuning, and experience with production-level usage, and so
is much faster than Ptrcheck as it currently stands, as of October
2008.</para>
<para>Consequently it is recommended to first make your programs run
Memcheck clean. Once that's done, try Ptrcheck to see if you can
shake out any further heap, global or stack errors.</para>
</sect1>
<sect1 id="pc-manual.limitations"
xreflabel="Limitations">
<title>Limitations</title>
<para>This is an experimental tool, which relies rather too heavily on some
not-as-robust-as-I-would-like assumptions on the behaviour of correct
programs. There are a number of limitations which you should be aware
of.</para>
<itemizedlist>
<listitem>
<para>Heap checks: Ptrcheck can occasionally lose track of, or
become confused about, which heap block a given pointer has been
derived from. This can cause it to falsely report errors, or to
miss some errors. This is not believed to be a serious
problem.</para>
</listitem>
<listitem>
<para>Heap checks: Ptrcheck only tracks pointers that are stored
properly aligned in memory. If a pointer is stored at a misaligned
address, and then later read again, Ptrcheck will lose track of
what it points at. A similar problem arises if a pointer is split
into pieces and later reconstituted.</para>
</listitem>
<listitem>
<para>Heap checks: Ptrcheck needs to "understand" which system
calls return pointers and which don't. Many, but not all, system
calls are handled. If an unhandled one is encountered, Ptrcheck
will abort.</para>
</listitem>
<listitem>
<para>Stack checks: It follows from the description above (How Ptrcheck
Works: Stack and Global Checks) that the first access by a memory
referencing instruction to a stack or global array creates an
association between that instruction and the array, which is
checked on subsequent accesses by that instruction, until the
containing function exits. Hence, the first access by an
instruction to an array (in any given function instantiation) is
not checked for overrun, since Ptrcheck uses that as the "example"
of how subsequent accesses should behave.</para>
</listitem>
<listitem>
<para>Stack checks: Similarly, and more serious, it is clearly
possible to write legitimate pieces of code which break the basic
assumption upon which the stack/global checking rests. For
example:</para>
<programlisting><![CDATA[
{ int a[10], b[10], *p, i;
for (i = 0; i < 10; i++) {
p = /* arbitrary condition */ ? &a[i] : &b[i];
*p = 42;
}
}
]]></programlisting>
<para>In this case the store sometimes
accesses <computeroutput>a[]</computeroutput> and
sometimes <computeroutput>b[]</computeroutput>, but in no cases is
the addressed array overrun. Nevertheless the change in target
will cause an error to be reported.</para>
<para>It is hard to see how to get around this problem. The only
mitigating factor is that such constructions appear very rare, at
least judging from the results using the tool so far. Such a
construction appears only once in the Valgrind sources (running
Valgrind on Valgrind) and perhaps two or three times for a start
and exit of Firefox. The best that can be done is to suppress the
errors.</para>
</listitem>
<listitem>
<para>Performance: the stack/global checks require reading all of
the DWARF3 type and variable information on the executable and its
shared objects. This is computationally expensive and makes
startup quite slow. You can expect debuginfo reading time to be in
the region of a minute for an OpenOffice sized application, on a
2.4 GHz Core 2 machine. Reading this information also requires a
lot of memory. To make it viable, Ptrcheck goes to considerable
trouble to compress the in-memory representation of the DWARF3
data, which is why the process of reading it appears slow.</para>
</listitem>
<listitem>
<para>Performance: Ptrcheck runs slower than Memcheck. This is
partly due to a lack of tuning, but partly due to algorithmic
difficulties. The heap-check side is potentially quite fast. The
stack and global checks can sometimes require a number of range
checks per memory access, and these are difficult to short-circuit
(despite considerable efforts having been made).
</para>
</listitem>
<listitem>
<para>Coverage: the heap checking is relatively robust, requiring
only that Ptrcheck can see calls to malloc/free et al. In that
sense it has debug-info requirements comparable with Memcheck, and
is able to heap-check programs even with no debugging information
attached.</para>
<para>Stack/global checking is much more fragile. If a shared
object does not have debug information attached, then Ptrcheck will
not be able to determine the bounds of any stack or global arrays
defined within that shared object, and so will not be able to check
accesses to them. This is true even when those arrays are accessed
from some other shared object which was compiled with debug
info.</para>
<para>At the moment Ptrcheck accepts objects lacking debuginfo
without comment. This is dangerous as it causes Ptrcheck to
silently skip stack and global checking for such objects. It would
be better to print a warning in such circumstances.</para>
</listitem>
<listitem>
<para>Coverage: Ptrcheck checks that the areas read or written by
system calls do not overrun heap blocks. But it doesn't currently
check them for overruns of stack and global arrays. This would be
easy to add.</para>
</listitem>
<listitem>
<para>Platforms: the stack/global checks won't work properly on any
PowerPC platforms, only on x86 and amd64 targets. That's because
the stack and global checking requires tracking function calls and
exits reliably, and there's no obvious way to do it with the PPC
ABIs. (By contrast, with the x86 and amd64 ABIs this is relatively
straightforward.)</para>
</listitem>
<listitem>
<para>Robustness: related to the previous point. Function
call/exit tracking for x86/amd64 is believed to work properly even
in the presence of longjmps within the same stack (although this
has not been tested). However, code which switches stacks is
likely to cause breakage/chaos.</para>
</listitem>
</itemizedlist>
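<para>To make the alignment limitation listed above concrete, the
following sketch shows two legitimate C constructions of the kind it
describes. The function
names <computeroutput>roundtrip_misaligned</computeroutput>
and <computeroutput>roundtrip_split</computeroutput> are hypothetical.
Both return the pointer value they were given, but the first passes it
through a misaligned memory location and the second splits it into two
halves and reassembles it. According to the limitation, Ptrcheck would
lose track of which heap block the returned pointer was derived
from.</para>
<programlisting><![CDATA[
#include <stdint.h>
#include <string.h>

/* Store a pointer at a misaligned address and read it back.
   This is legal C, since memcpy copies individual bytes. */
void *roundtrip_misaligned(void *p)
{
   /* The union forces pointer alignment of 'bytes', so bytes + 1 is
      misaligned on any target where pointers need alignment > 1. */
   union { void *align; unsigned char bytes[2 * sizeof(void *)]; } buf;
   void *q;
   memcpy(buf.bytes + 1, &p, sizeof p);
   memcpy(&q, buf.bytes + 1, sizeof q);
   return q;
}

/* Split a pointer into low and high halves, then reassemble it. */
void *roundtrip_split(void *p)
{
   uintptr_t v    = (uintptr_t)p;
   unsigned  half = (unsigned)(sizeof v * 4);   /* half the width, in bits */
   uintptr_t mask = ((uintptr_t)1 << half) - 1;
   uintptr_t lo   = v & mask;
   uintptr_t hi   = v >> half;
   return (void *)((hi << half) | lo);
}
]]></programlisting>
<para>Code that relies on such constructions for heap pointers may
therefore see missed or spurious heap reports under Ptrcheck, even
though the program itself is correct.</para>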
</sect1>
<sect1 id="pc-manual.todo-user-visible"
xreflabel="Still To Do: User Visible Functionality">
<title>Still To Do: User Visible Functionality</title>
<itemizedlist>
<listitem>
<para>Extend system call checking to work on stack and global arrays.</para>
</listitem>
<listitem>
<para>Print a warning if a shared object does not have debug info
attached, or if, for whatever reason, debug info could not be
found, or read.</para>
</listitem>
</itemizedlist>
</sect1>
<sect1 id="pc-manual.todo-implementation"
xreflabel="Still To Do: Implementation Tidying">
<title>Still To Do: Implementation Tidying</title>
<para>Items marked CRITICAL are considered important for correctness:
non-fixage of them is liable to lead to crashes or assertion failures
in real use.</para>
<itemizedlist>
<listitem>
<para>h_main.c: make N_FREED_SEGS command-line configurable.</para>
</listitem>
<listitem>
<para> sg_main.c: Improve the performance of the stack / global
checks by doing some up-front filtering to ignore references in
areas which "obviously" can't be stack or globals. This will
require using information that m_aspacemgr knows about the address
space layout.</para>
</listitem>
<listitem>
<para>h_main.c: get rid of the last_seg_added hack; add suitable
plumbing to the core/tool interface to do this cleanly.</para>
</listitem>
<listitem>
<para>h_main.c: move vast amounts of arch-dependent ugliness
(get_IntRegInfo et al) to its own source file, a la
mc_machine.c.</para>
</listitem>
<listitem>
<para>h_main.c: make the lossage-check stuff work again, as a way
of doing quality assurance on the implementation.</para>
</listitem>
<listitem>
<para>h_main.c: schemeEw_Atom: don't generate a call to
nonptr_or_unknown; this is really stupid, since it could be done at
translation time instead.</para>
</listitem>
<listitem>
<para>CRITICAL: h_main.c: h_instrument (main instrumentation fn):
generate shadows for word-sized temps defined in the block's
preamble. (Why does this work at all, as it stands?)</para>
</listitem>
<listitem>
<para>sg_main.c: fix compute_II_hash to make it a bit more sensible
for ppc32/64 targets (except that sg_ doesn't work on ppc32/64
targets, so this is a bit academic at the mo).</para>
</listitem>
</itemizedlist>
</sect1>
</chapter>