XML-ise exp-ptrcheck's documentation.


git-svn-id: svn://svn.valgrind.org/valgrind/trunk@8702 a5019735-40e9-0310-863c-91ae7b9d1cf9
diff --git a/docs/xml/manual.xml b/docs/xml/manual.xml
index 149afa5..727570a 100644
--- a/docs/xml/manual.xml
+++ b/docs/xml/manual.xml
@@ -36,6 +36,8 @@
       xmlns:xi="http://www.w3.org/2001/XInclude" />
   <xi:include href="../../massif/docs/ms-manual.xml" parse="xml"  
       xmlns:xi="http://www.w3.org/2001/XInclude" />
+  <xi:include href="../../exp-ptrcheck/docs/pc-manual.xml" parse="xml"  
+      xmlns:xi="http://www.w3.org/2001/XInclude" />
   <xi:include href="../../none/docs/nl-manual.xml" parse="xml"  
       xmlns:xi="http://www.w3.org/2001/XInclude" />
   <xi:include href="../../lackey/docs/lk-manual.xml" parse="xml"  
diff --git a/docs/xml/valgrind-manpage.xml b/docs/xml/valgrind-manpage.xml
index c30b77c..e45d72a 100644
--- a/docs/xml/valgrind-manpage.xml
+++ b/docs/xml/valgrind-manpage.xml
@@ -250,6 +250,17 @@
 
 
 
+<refsect1 id="ptrcheck-options">
+<title>Ptrcheck Options</title>
+
+<xi:include href="../../exp-ptrcheck/docs/pc-manual.xml" 
+            xpointer="pc.opts.list"
+            xmlns:xi="http://www.w3.org/2001/XInclude" />
+
+</refsect1>
+
+
+
 <refsect1 id="lackey-options">
 <title>Lackey Options</title>
 
diff --git a/exp-ptrcheck/docs/Makefile.am b/exp-ptrcheck/docs/Makefile.am
index e69de29..60d2880 100644
--- a/exp-ptrcheck/docs/Makefile.am
+++ b/exp-ptrcheck/docs/Makefile.am
@@ -0,0 +1 @@
+EXTRA_DIST = pc-manual.xml
diff --git a/exp-ptrcheck/docs/pc-manual.xml b/exp-ptrcheck/docs/pc-manual.xml
new file mode 100644
index 0000000..f3d142d
--- /dev/null
+++ b/exp-ptrcheck/docs/pc-manual.xml
@@ -0,0 +1,531 @@
+<?xml version="1.0"?> <!-- -*- sgml -*- -->
+<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
+          "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd"
+[ <!ENTITY % vg-entities SYSTEM "../../docs/xml/vg-entities.xml"> %vg-entities; ]>
+
+
+<chapter id="pc-manual" 
+         xreflabel="Ptrcheck: an (experimental) pointer checking tool">
+  <title>Ptrcheck: an (experimental) pointer checking tool</title>
+
+<para>To use this tool, you must specify
+<computeroutput>--tool=exp-ptrcheck</computeroutput> on the Valgrind
+command line.</para>
+
+
+
+
+<sect1 id="pc-manual.overview" xreflabel="Overview">
+<title>Overview</title>
+
+<para>Ptrcheck is a Valgrind tool for finding overruns of heap, stack
+and global arrays.  Its functionality overlaps somewhat with
+Memcheck's, but it is able to catch invalid accesses in a number of
+cases that Memcheck would miss.  A detailed comparison against
+Memcheck is presented below.</para>
+
+<para>Ptrcheck is composed of two almost completely independent tools
+that have been glued together.  One part,
+in <computeroutput>h_main.[ch]</computeroutput>, checks accesses
+through heap-derived pointers.  The other part, in
+<computeroutput>sg_main.[ch]</computeroutput>, checks accesses to
+stack and global arrays.  The remaining
+files, <computeroutput>pc_{common,main}.[ch]</computeroutput>, provide
+common error-management and coordination functions, so as to make the
+two parts appear as a single tool.</para>
+
+<para>The heap-check part is an extensively-hacked (largely rewritten)
+version of the experimental "Annelid" tool developed and described by
+Nicholas Nethercote and Jeremy Fitzhardinge.  The stack- and global-
+check part uses a heuristic approach derived from an observation about
+the likely forms of stack and global array accesses, and, as far as is
+known, is entirely novel.</para>
+
+</sect1>
+
+
+
+
+<sect1 id="pc-manual.options" xreflabel="Ptrcheck Options">
+<title>Ptrcheck Options</title>
+
+<para>The following end-user options are available:</para>
+
+<!-- start of xi:include in the manpage -->
+<variablelist id="pc.opts.list">
+
+  <varlistentry id="opt.enable-sg-checks" xreflabel="--enable-sg-checks">
+    <term>
+      <option><![CDATA[--enable-sg-checks=no|yes
+      [default: yes] ]]></option>
+    </term>
+    <listitem>
+      <para>By default, Ptrcheck checks for overruns of stack, global
+       and heap arrays.
+       With <varname>--enable-sg-checks=no</varname>, the stack and
+       global array checks are omitted, and only heap checking is
+       performed.  This can be useful because the stack and global
+       checks are quite expensive, so omitting them speeds Ptrcheck up
+       a lot.
+      </para>
+    </listitem>
+  </varlistentry>
+
+  <varlistentry id="opt.partial-loads-ok" xreflabel="--partial-loads-ok">
+    <term>
+      <option><![CDATA[--partial-loads-ok=<yes|no> [default: no] ]]></option>
+    </term>
+    <listitem>
+      <para>This option has the same meaning as it does for
+      Memcheck.</para>
+      <para>Controls how Ptrcheck handles word-sized, word-aligned
+      loads which partially overlap the end of heap blocks -- that is,
+      some of the bytes in the word are validly addressable, but
+      others are not.  When <varname>yes</varname>, such loads do not
+      produce an address error.  When <varname>no</varname> (the
+      default), loads from partially invalid addresses are treated the
+      same as loads from completely invalid addresses: an illegal heap
+      access error is issued.
+      </para>
+      <para>Note that code that behaves in this way is in violation of
+      the ISO C/C++ standards, and should be considered broken.  If
+      at all possible, such code should be fixed.  This flag should be
+      used only as a last resort.</para>
+    </listitem>
+  </varlistentry>
+
+</variablelist>
+<!-- end of xi:include in the manpage -->
+
+<!-- start of xi:include in the manpage -->
+<para>In addition, the following debugging options are available for
+Ptrcheck:</para>
+
+<variablelist id="hg.debugopts.list">
+
+  <varlistentry id="opt.trace-malloc" xreflabel="--trace-malloc">
+    <term>
+      <option><![CDATA[--trace-malloc=no|yes [no]
+      ]]></option>
+    </term>
+    <listitem>
+      <para>Show all client malloc (etc) and free (etc) requests.</para>
+    </listitem>
+  </varlistentry>
+
+</variablelist>
+<!-- end of xi:include in the manpage -->
+
+
+</sect1>
+
+
+
+
+<sect1 id="pc-manual.how-works.heap-checks"
+       xreflabel="How Ptrcheck Works: Heap Checks">
+<title>How Ptrcheck Works: Heap Checks</title>
+
+<para>Ptrcheck can check for invalid uses of heap pointers, including
+out of range accesses and accesses to freed memory.  The mechanism is
+however completely different from Memcheck's, and the checking is more
+powerful.</para>
+
+<para>For each pointer in the program, Ptrcheck keeps track of which
+heap block (if any) it was derived from.  Then, when an access is made
+through that pointer, Ptrcheck compares the access address with the
+bounds of the associated block, and reports an error if the address is
+out of bounds, or if the block has been freed.</para>
+
+<para>Of course it is rarely the case that one wants to access a block
+only at the exact address returned by malloc (et al).  Ptrcheck
+understands that adding or subtracting offsets from a pointer to a
+block results in a pointer to the same block.</para>
+
+<para>At a fundamental level, this scheme works because a correct
+program cannot make assumptions about the addresses returned by
+malloc.  In particular it cannot make any assumptions about the
+differences in addresses returned by subsequent calls to malloc.
+Hence there are very few ways to take an address returned by malloc,
+modify it, and still have a valid address.  In short, the only
+allowable operations are adding and subtracting other non-pointer
+values.  Almost all other operations produce a value which cannot
+possibly be a valid pointer.</para>
+
+</sect1>
+
+
+
+<sect1 id="pc-manual.how-works.sg-checks"
+       xreflabel="How Ptrcheck Works: Stack and Global Checks">
+<title>How Ptrcheck Works: Stack and Global Checks</title>
+
+<para>When a source file is compiled
+with <computeroutput>-g</computeroutput>, the compiler attaches DWARF3
+debugging information which describes the location of all stack and
+global arrays in the file.</para>
+
+<para>Checking of accesses to such arrays would then be relatively
+simple, if the compiler could also tell us which array (if any) each
+memory referencing instruction was supposed to access.  Unfortunately
+the DWARF3 debugging format does not provide a way to represent such
+information, so we have to resort to a heuristic technique to
+approximate the same information.  The key observation is that
+</para>
+
+   <para>
+   if a memory referencing instruction accesses inside a stack or
+   global array once, then it is highly likely to always access that
+   same array</para>
+
+<para>To see how this might be useful, consider the following buggy
+fragment:</para>
+<programlisting><![CDATA[
+   { int i, a[10];  // both are auto vars
+     for (i = 0; i <= 10; i++)
+        a[i] = 42;
+   }
+]]></programlisting>
+
+<para>At run time we will know the precise address
+of <computeroutput>a[]</computeroutput> on the stack, and so we can
+observe that the first store resulting from <computeroutput>a[i] =
+42</computeroutput> writes <computeroutput>a[]</computeroutput>, and
+we will (correctly) assume that that instruction is intended always to
+access <computeroutput>a[]</computeroutput>.  Then, on the 11th
+iteration, it accesses somewhere else, possibly a different local,
+possibly an un-accounted for area of the stack (eg, spill slot), so
+Ptrcheck reports an error.</para>
+
+<para>There is an important caveat.</para>
+
+<para>Imagine a function such as memcpy, which is used to read and
+write many different areas of memory over the lifetime of the program.
+If we insist that the read and write instructions in its memory
+copying loop only ever access one particular stack or global variable,
+we will be flooded with errors resulting from calls to memcpy.</para>
+
+<para>To avoid this problem, Ptrcheck instantiates fresh likely-target
+records for each entry to a function, and discards them on exit.  This
+allows detection of cases where (eg) memcpy overflows its source or
+destination buffers for any specific call, but does not carry any
+restriction from one call to the next.  Indeed, multiple threads may
+be making multiple simultaneous calls to (eg) memcpy without mutual
+interference.</para>
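+
+<para>As a minimal sketch of why per-call records matter (the
+function <computeroutput>fill</computeroutput> and both arrays are
+hypothetical names, not taken from any real program), the single
+store in the loop below writes two different global arrays over the
+lifetime of the program.  Under the scheme described above, each call
+should get its own likely-target record, so neither call would be
+expected to produce a report:</para>
+<programlisting><![CDATA[
+#include <stdio.h>
+
+/* Hypothetical global arrays, used only for illustration. */
+static int first[10];
+static int second[20];
+
+/* The store "dst[i] = value" is one instruction that is used on
+   different arrays by different calls, much like the copy loop
+   inside memcpy. */
+static void fill(int *dst, int n, int value)
+{
+   int i;
+   for (i = 0; i < n; i++)
+      dst[i] = value;
+}
+
+int main(void)
+{
+   fill(first, 10, 1);   /* during this call the store targets first[]  */
+   fill(second, 20, 2);  /* a new call: the store now targets second[] */
+   printf("%d %d\n", first[9], second[19]);
+   return 0;
+}
+]]></programlisting>
+
+<para>If the records were instead kept for the lifetime of the
+program, the second call would look like a change of target and
+would presumably be reported, even though no array is overrun.</para>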
+
+</sect1>
+
+
+
+
+<sect1 id="pc-manual.cmp-w-memcheck"
+       xreflabel="Comparison with Memcheck">
+<title>Comparison with Memcheck</title>
+
+<para>Memcheck does not do any access checks for stack or global arrays, so
+the presence of those in Ptrcheck is a straight win.  (But see
+"Limitations" below).</para>
+
+<para>Memcheck and Ptrcheck use different approaches for checking heap
+accesses.  Memcheck maintains bitmaps telling it which areas of memory
+are accessible and which are not.  If a memory access falls in an
+inaccessible area, it reports an error.  By marking the 16 bytes
+before and after an allocated block inaccessible, Memcheck is able to
+detect small over- and underruns of the block.  Similarly, by marking
+freed memory as inaccessible, Memcheck can detect all accesses to
+freed memory.</para>
+
+<para>Memcheck's approach is simple.  But it's also weak.  It can't
+catch block overruns beyond 16 bytes.  And, more generally, because it
+focusses only on the question "is the target address accessible", it
+fails to detect invalid accesses which just happen to fall within some
+other valid area.  This is not improbable, especially in crowded areas
+of the process' address space.</para>
+
+<para>Ptrcheck's approach is to keep track of pointers derived from
+heap blocks.  It tracks pointers which are derived directly from calls
+to malloc et al, but also ones derived indirectly, by adding or
+subtracting offsets from the directly-derived pointers.  When a
+pointer is finally used to access memory, Ptrcheck compares the access
+address with that of the block it was originally derived from, and
+reports an error if the access address is not within the block
+bounds.</para>
+
+<para>Consequently Ptrcheck can detect any out of bounds access
+through a heap-derived pointer, no matter how far from the original
+block it is.</para>
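+
+<para>The following is a small sketch of such a far-out-of-bounds
+access, not a tested reproducer.  The variable names, the block size
+and the offset are all assumptions chosen for illustration.  The
+faulty store is only executed when the program is given a command
+line argument, so that the program is well defined when run without
+one.  The offending address lies far beyond the 16 bytes that
+Memcheck marks around the block, and depending on the heap layout it
+may happen to fall inside other accessible memory:</para>
+<programlisting><![CDATA[
+#include <stdlib.h>
+#include <string.h>
+
+int main(int argc, char **argv)
+{
+   char *block = malloc(16);   /* hypothetical 16-byte heap block */
+   char *p;
+
+   (void)argv;
+   if (block == NULL)
+      return 1;
+   memset(block, 0, 16);
+
+   /* p is derived from block by adding an offset, so it is still
+      associated with the same heap block. */
+   p = block + 8;
+   p[7] = 1;                   /* block[15]: last valid byte, fine */
+
+   if (argc > 1) {
+      /* DELIBERATE BUG, taken only when an argument is supplied:
+         block[100008] is far outside the 16-byte block. */
+      p[100000] = 1;
+   }
+
+   free(block);
+   return 0;
+}
+]]></programlisting>
+
+<para>Because the check is made against the bounds of the block that
+<computeroutput>p</computeroutput> was derived from, the distance
+from the block should not matter.</para>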
+
+<para>A second advantage is that Ptrcheck is better at detecting
+accesses to blocks freed very far in the past.  Memcheck can detect
+these too, but only for blocks freed relatively recently.  To detect
+accesses to a freed block, Memcheck must make it inaccessible, hence
+requiring a space overhead proportional to the size of the block.  If
+the blocks are large, Memcheck will have to make them available for
+re-allocation relatively quickly, thereby losing the ability to detect
+invalid accesses to them.</para>
+
+<para>By contrast, Ptrcheck has a constant per-block space requirement
+of four machine words, for detection of accesses to freed blocks.  A
+freed block can be reallocated immediately, yet Ptrcheck can still
+detect all invalid accesses through any pointers derived from the old
+allocation, providing only that the four-word descriptor for the old
+allocation is stored.  For example, on a 64-bit machine, to detect
+accesses in any of the most recently freed 10 million blocks, Ptrcheck
+will require only 320MB of extra storage (10 million blocks x 4 words
+x 8 bytes).  Achieving the same level of detection with Memcheck is
+close to impossible and would likely involve several gigabytes of
+extra storage.</para>
+
+<para>In defense of Memcheck ...</para>
+
+<para>Remember that Memcheck performs uninitialised value checking,
+which Ptrcheck does not.  Memcheck has also benefitted from years of
+refinement, tuning, and experience with production-level usage, and so
+is much faster than Ptrcheck as it currently stands, as of October
+2008.</para>
+
+<para>Consequently it is recommended to first make your programs run
+Memcheck clean.  Once that's done, try Ptrcheck to see if you can
+shake out any further heap, global or stack errors.</para>
+
+</sect1>
+
+
+
+
+
+<sect1 id="pc-manual.limitations"
+       xreflabel="Limitations">
+<title>Limitations</title>
+
+<para>This is an experimental tool, which relies rather too heavily on some
+not-as-robust-as-I-would-like assumptions on the behaviour of correct
+programs.  There are a number of limitations which you should be aware
+of.</para>
+
+<itemizedlist>
+
+  <listitem>
+   <para>Heap checks: Ptrcheck can occasionally lose track of, or
+   become confused about, which heap block a given pointer has been
+   derived from.  This can cause it to falsely report errors, or to
+   miss some errors.  This is not believed to be a serious
+   problem.</para>
+  </listitem>
+
+  <listitem>
+   <para>Heap checks: Ptrcheck only tracks pointers that are stored
+   properly aligned in memory.  If a pointer is stored at a misaligned
+   address, and then later read again, Ptrcheck will lose track of
+   what it points at.  A similar problem arises if a pointer is split
+   into pieces and later reconstituted.</para>
+  </listitem>
+
+  <listitem>
+   <para>Heap checks: Ptrcheck needs to "understand" which system
+   calls return pointers and which don't.  Many, but not all, system
+   calls are handled.  If an unhandled one is encountered, Ptrcheck
+   will abort.</para>
+  </listitem>
+
+  <listitem>
+   <para>Stack checks: It follows from the description above (How Ptrcheck
+   Works: Stack and Global Checks) that the first access by a memory
+   referencing instruction to a stack or global array creates an
+   association between that instruction and the array, which is
+   checked on subsequent accesses by that instruction, until the
+   containing function exits.  Hence, the first access by an
+   instruction to an array (in any given function instantiation) is
+   not checked for overrun, since Ptrcheck uses that as the "example"
+   of how subsequent accesses should behave.</para>
+  </listitem>
+
+  <listitem>
+   <para>Stack checks: Similarly, and more seriously, it is clearly
+   possible to write legitimate pieces of code which break the basic
+   assumption upon which the stack/global checking rests.  For
+   example:</para>
+
+<programlisting><![CDATA[
+  { int a[10], b[10], *p, i;
+    for (i = 0; i < 10; i++) {
+       p = /* arbitrary condition */  ? &a[i]  : &b[i];
+       *p = 42;
+    }
+  }
+]]></programlisting>
+
+   <para>In this case the store sometimes
+   accesses <computeroutput>a[]</computeroutput> and
+   sometimes <computeroutput>b[]</computeroutput>, but in no cases is
+   the addressed array overrun.  Nevertheless the change in target
+   will cause an error to be reported.</para>
+
+   <para>It is hard to see how to get around this problem.  The only
+   mitigating factor is that such constructions appear very rare, at
+   least judging from the results using the tool so far.  Such a
+   construction appears only once in the Valgrind sources (running
+   Valgrind on Valgrind) and perhaps two or three times for a start
+   and exit of Firefox.  The best that can be done is to suppress the
+   errors.</para>
+  </listitem>
+
+  <listitem>
+   <para>Performance: the stack/global checks require reading all of
+   the DWARF3 type and variable information on the executable and its
+   shared objects.  This is computationally expensive and makes
+   startup quite slow.  You can expect debuginfo reading time to be in
+   the region of a minute for an OpenOffice sized application, on a
+   2.4 GHz Core 2 machine.  Reading this information also requires a
+   lot of memory.  To make it viable, Ptrcheck goes to considerable
+   trouble to compress the in-memory representation of the DWARF3
+   data, which is why the process of reading it appears slow.</para>
+  </listitem>
+
+  <listitem>
+   <para>Performance: Ptrcheck runs slower than Memcheck.  This is
+   partly due to a lack of tuning, but partly due to algorithmic
+   difficulties.  The heap-check side is potentially quite fast.  The
+   stack and global checks can sometimes require a number of range
+   checks per memory access, and these are difficult to short-circuit
+   (despite considerable efforts having been made).
+   </para>
+  </listitem>
+
+  <listitem>
+   <para>Coverage: the heap checking is relatively robust, requiring
+   only that Ptrcheck can see calls to malloc/free et al.  In that
+   sense it has debug-info requirements comparable with Memcheck, and
+   is able to heap-check programs even with no debugging information
+   attached.</para>
+
+   <para>Stack/global checking is much more fragile.  If a shared
+   object does not have debug information attached, then Ptrcheck will
+   not be able to determine the bounds of any stack or global arrays
+   defined within that shared object, and so will not be able to check
+   accesses to them.  This is true even when those arrays are accessed
+   from some other shared object which was compiled with debug
+   info.</para>
+
+   <para>At the moment Ptrcheck accepts objects lacking debuginfo
+   without comment.  This is dangerous as it causes Ptrcheck to
+   silently skip stack and global checking for such objects.  It would
+   be better to print a warning in such circumstances.</para>
+  </listitem>
+
+  <listitem>
+   <para>Coverage: Ptrcheck checks that the areas read or written by
+   system calls do not overrun heap blocks.  But it doesn't currently
+   check them for overruns of stack and global arrays.  This would be
+   easy to add.</para>
+  </listitem>
+
+  <listitem>
+   <para>Platforms: the stack/global checks won't work properly on any
+   PowerPC platforms, only on x86 and amd64 targets.  That's because
+   the stack and global checking requires tracking function calls and
+   exits reliably, and there's no obvious way to do it with the PPC
+   ABIs.  (By contrast, with the x86 and amd64 ABIs this is relatively
+   straightforward.)</para>
+  </listitem>
+
+  <listitem>
+   <para>Robustness: related to the previous point.  Function
+   call/exit tracking for x86/amd64 is believed to work properly even
+   in the presence of longjmps within the same stack (although this
+   has not been tested).  However, code which switches stacks is
+   likely to cause breakage/chaos.</para>
+  </listitem>
+</itemizedlist>
+
+</sect1>
+
+
+
+
+
+<sect1 id="pc-manual.todo-user-visible"
+       xreflabel="Still To Do: User Visible Functionality">
+<title>Still To Do: User Visible Functionality</title>
+
+<itemizedlist>
+
+  <listitem>
+   <para>Extend system call checking to work on stack and global arrays.</para>
+  </listitem>
+
+  <listitem>
+   <para>Print a warning if a shared object does not have debug info
+   attached, or if, for whatever reason, debug info could not be
+   found, or read.</para>
+  </listitem>
+
+</itemizedlist>
+
+</sect1>
+
+
+
+
+<sect1 id="pc-manual.todo-implementation"
+       xreflabel="Still To Do: Implementation Tidying">
+<title>Still To Do: Implementation Tidying</title>
+
+<para>Items marked CRITICAL are considered important for correctness:
+non-fixage of them is liable to lead to crashes or assertion failures
+in real use.</para>
+
+<itemizedlist>
+
+  <listitem>
+   <para>h_main.c: make N_FREED_SEGS command-line configurable.</para>
+  </listitem>
+ 
+  <listitem>
+   <para> sg_main.c: Improve the performance of the stack / global
+   checks by doing some up-front filtering to ignore references in
+   areas which "obviously" can't be stack or globals.  This will
+   require using information that m_aspacemgr knows about the address
+   space layout.</para>
+  </listitem>
+ 
+  <listitem>
+   <para>h_main.c: get rid of the last_seg_added hack; add suitable
+   plumbing to the core/tool interface to do this cleanly.</para>
+  </listitem>
+  
+  <listitem>
+   <para>h_main.c: move vast amounts of arch-dependent ugliness
+   (get_IntRegInfo et al) to its own source file, a la
+   mc_machine.c.</para>
+  </listitem>
+ 
+  <listitem>
+   <para>h_main.c: make the lossage-check stuff work again, as a way
+   of doing quality assurance on the implementation.</para>
+  </listitem>
+  
+  <listitem>
+   <para>h_main.c: schemeEw_Atom: don't generate a call to
+   nonptr_or_unknown; this is really stupid, since it could be done at
+   translation time instead.</para>
+  </listitem>
+  
+  <listitem>
+   <para>CRITICAL: h_main.c: h_instrument (main instrumentation fn):
+   generate shadows for word-sized temps defined in the block's
+   preamble.  (Why does this work at all, as it stands?)</para>
+  </listitem>
+  
+  <listitem>
+   <para>sg_main.c: fix compute_II_hash to make it a bit more sensible
+   for ppc32/64 targets (except that sg_ doesn't work on ppc32/64
+   targets, so this is a bit academic at the mo).</para>
+  </listitem>
+  
+</itemizedlist>
+
+</sect1>
+
+
+
+</chapter>