We have a nice little collection of text files describing various high
level things. But they're all over the place. This commit moves
them all to the new docs/internals/ directory, and gives them
a consistent naming scheme.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@4196 a5019735-40e9-0310-863c-91ae7b9d1cf9
diff --git a/docs/Makefile.am b/docs/Makefile.am
index bf55b6d..e266b30 100644
--- a/docs/Makefile.am
+++ b/docs/Makefile.am
@@ -1,11 +1,6 @@
-SUBDIRS = xml lib images
+SUBDIRS = xml lib images internals
-EXTRA_DIST = \
- 64-bit-cleanness \
- directory-structure \
- README \
- porting-HOWTO porting-to-ARM \
- tm-mutexstates.dot tm-threadstates.dot
+EXTRA_DIST = README
dist_man_MANS = valgrind.1
diff --git a/docs/README b/docs/README
index 250aa5c..57cafd3 100644
--- a/docs/README
+++ b/docs/README
@@ -26,6 +26,8 @@
docs/xml/vg-entities.xml: Various strings, dates etc. used all over
docs/xml/xml_help.txt: Basic guide to common XML tags.
+The docs/internals directory contains some useful high-level stuff about
+Valgrind's internals. It's not relevant for the rest of this discussion.
Overview
---------
diff --git a/docs/64-bit-cleanness b/docs/internals/64-bit-cleanness.txt
similarity index 89%
rename from docs/64-bit-cleanness
rename to docs/internals/64-bit-cleanness.txt
index 45d4241..f0a3013 100644
--- a/docs/64-bit-cleanness
+++ b/docs/internals/64-bit-cleanness.txt
@@ -1,6 +1,12 @@
-----------------------------------------------------------------------------
64-bit cleanness
-----------------------------------------------------------------------------
+
+[19-Jul-2005: I assume most of these are gone, now that AMD64 is working
+pretty well. The Addrcheck and Helgrind ones are probably still true,
+though. --njn]
+
+
The following are places I know or suspect contain code that is not 64-bit
clean. Please mark them off this list as they are fixed, and add any new ones
you know of.
diff --git a/docs/directory-structure b/docs/internals/directory-structure.txt
similarity index 100%
rename from docs/directory-structure
rename to docs/internals/directory-structure.txt
diff --git a/docs/internals/m_replacemalloc.txt b/docs/internals/m_replacemalloc.txt
new file mode 100644
index 0000000..2fde258
--- /dev/null
+++ b/docs/internals/m_replacemalloc.txt
@@ -0,0 +1,29 @@
+The structure of this module is worth noting.
+
+The main part is in vg_replace_malloc.c. It gets compiled into the tool's
+'preload' shared object, which goes into the client's area of memory, and
+runs on the simulated CPU just like client code. As a result, it cannot
+use any functions in the core directly; it can only communicate with the
+core using client requests, just like any other client code.
+
+And yet it must call the tool's malloc wrappers. How does it know where
+they are? The init function uses a client request which asks for the list
+of all the core functions (and variables) that it needs to access. It then
+uses a client request each time it needs to call one of these.
+
+This means that the following sequence occurs each time a tool that uses
+this module starts up:
+
+ - Tool does initialisation, including calling VG_(malloc_funcs)() to tell
+ the core the names of its malloc wrappers. These are stored in
+ VG_(tdict).
+
+ - On the first allocation, vg_replace_malloc.c:init() calls the
+ GET_MALLOCFUNCS client request to get the names of the malloc wrappers
+ out of VG_(tdict), storing them in 'info'.
+
+ - All calls to these functions are done using 'info'.
+
+This is a bit complex, but it's hard to see how it can be done more simply.
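+
+A sketch of that start-up flow (the names DO_CLIENT_REQUEST,
+CALL_FN_IN_CORE, struct vg_mallocfunc_info and tl_malloc are invented
+for illustration; the real client-request plumbing is in
+vg_replace_malloc.c):
+
+   static struct vg_mallocfunc_info info;   /* wrapper entry points */
+   static int init_done = 0;
+
+   static void init ( void )
+   {
+      if (init_done) return;
+      /* Client request: ask the core to copy the tool's malloc
+         wrappers, as recorded in VG_(tdict), into 'info'. */
+      DO_CLIENT_REQUEST(GET_MALLOCFUNCS, &info);       /* hypothetical */
+      init_done = 1;
+   }
+
+   void* malloc ( SizeT n )
+   {
+      init();
+      /* Calls to the wrappers also go via a client request, since
+         this code runs on the simulated CPU. */
+      return (void*)CALL_FN_IN_CORE(info.tl_malloc, n); /* hypothetical */
+   }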
+
+
diff --git a/docs/internals/m_syswrap.txt b/docs/internals/m_syswrap.txt
new file mode 100644
index 0000000..1e7c0fb
--- /dev/null
+++ b/docs/internals/m_syswrap.txt
@@ -0,0 +1,14 @@
+
+This module handles the complex business of handing system calls off
+to the host and then fixing up the guest state accordingly. It
+interacts in complicated ways with signals and, to a lesser extent,
+with threads.
+
+There are some important caveats regarding how to write the PRE and
+POST wrappers for syscalls. It is important to observe these, else
+you will have to track down almost impossibly obscure bugs. These
+caveats are described in comments at the top of syswrap-main.c.
+
+The main file is syswrap-main.c. It contains all the driver logic
+and a great deal of commentary. The wrappers themselves live in
+syswrap-generic.c, syswrap-${OS}.c and syswrap-${PLATFORM}.c.
+
diff --git a/docs/internals/module-structure.txt b/docs/internals/module-structure.txt
new file mode 100644
index 0000000..4f66de2
--- /dev/null
+++ b/docs/internals/module-structure.txt
@@ -0,0 +1,59 @@
+
+Our long-term goal is to structure Valgrind's top level as a
+set of well-defined modules. Much of the difficulty in maintaining
+the beast is caused by the lack of clear boundaries, definitions and
+semantics for subsystems (modules), and in particular a lack of
+clarity about which modules may depend on which others. The ongoing
+modularisation activities are aimed at dealing with this problem.
+
+Architecture dependent stuff will be chopped up and placed into the
+relevant modules. Since the system's top level is now to be
+structured as modules with clearly delimited areas of functionality,
+directories such as 'amd64', 'amd64-linux', etc, cannot continue to
+exist long-term. These trees contain mish-mashes of functionality
+from multiple different modules, and so make no sense as top-level
+entities in a scheme where all top-level entities are modules.
+
+This process is ongoing. Consequently some of the code in coregrind/
+has been brought into the module structure, but much hasn't. A naming
+scheme distinguishes the done vs not-done stuff:
+
+ Consider a module of name 'foo'.
+
+ If 'foo' is implemented in a single C file, and requires no other
+ files, it will live in coregrind/m_foo.c.
+
+ Otherwise (if 'foo' requires more than one C file, or more than
+ zero private header files, or any other kind of auxiliary stuff)
+ then it will live in the directory coregrind/m_foo.
+
+Each module 'foo' must have two associated header files which describe
+its public (exported) interface:
+
+ include/pub_tool_foo.h
+ coregrind/pub_core_foo.h
+
+pub_tool_foo.h describes that part of the module's functionality that
+is visible to tools. Hopefully this can be minimal or zero. If there
+is nothing visible to tools, pub_tool_foo.h can be omitted.
+
+pub_core_foo.h describes functionality that is visible to other
+modules in the core. This is a strict superset of the visible-to-tool
+functionality. Consequently, pub_core_foo.h *must* #include
+pub_tool_foo.h, if it exists. pub_tool_foo.h *must not* #include
+pub_core_foo.h, nor any other pub_core_ header for that matter.
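+
+For example, for a module 'foo' (a sketch of the intended layout, not
+actual Valgrind headers):
+
+   /* include/pub_tool_foo.h -- the tool-visible interface */
+   extern void VG_(foo_for_tools) ( void );
+
+   /* coregrind/pub_core_foo.h -- the core-visible interface */
+   #include "pub_tool_foo.h"   /* mandatory: core view is a superset */
+   extern void VG_(foo_core_only) ( void );
+
+Tool sources may #include only the pub_tool_ header; core modules
+#include the pub_core_ header and thereby see both.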
+
+Module-private headers are named "priv_foo.h".
+
+No module may include the private headers of any other module. If a
+type/enum/function/struct/whatever is declared in neither
+include/pub_tool_foo.h nor coregrind/pub_core_foo.h then module 'foo'
+DOES NOT EXPORT IT.
+
+Over time it is hoped to develop some simple Perl scripts to scan
+source files for #includes so as to mechanically enforce these rules.
+One of the most infuriating aspects of C is the total lack of support
+for building properly abstracted subsystems. This is in sharp
+contrast to languages such as Modula-3, Haskell and ML, all of which
+have support for modules built into the language, and hence such
+boundaries are enforceable by the compiler.
diff --git a/docs/internals/notes.txt b/docs/internals/notes.txt
new file mode 100644
index 0000000..e15966f
--- /dev/null
+++ b/docs/internals/notes.txt
@@ -0,0 +1,168 @@
+20 Jun 05
+~~~~~~~~~
+PPC32 port
+* Paul wrote some code to deal with setting/clearing reservations.
+ (grep USE_MACHINE_RESERVATION, ARCH_SWITCH_TO, lwarx, stwcx.)
+ Not yet looked into, but this may be needed.
+
+11 May 05
+~~~~~~~~~
+ToDo: vex-amd64: check above/below the line for reg-alloc
+
+23 Apr 05 (memcheck-on-amd64 notes)
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+* If a thread is given an initial stack with address range [lo .. hi],
+ we need to tell memcheck that the area [lo - VGA_STACK_REDZONE_SZB
+ .. hi] is valid, rather than just [lo .. hi] as has been the case on
+ x86-only systems. However, am not sure where to look for the call
+  x86-only systems. However, I am not sure where to look for the call
+
+Notes pertaining to the 2.4.0 - 3.0 merge
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+As of 10 March (svn rev 3266, vex svn rev 1019) the merged code base
+can start and run programs with --tool=none. Both threaded and
+unthreaded programs appear to work (knode, opera, konqueror).
+
+Known breakage is:
+
+* Basically only x86 works. I was part-way through getting amd64
+ to work when I stopped to do the merge. I think you can assume
+ amd64 is pretty much knackered right now.
+
+* No other tools work. Memcheck worked fine in 3.0 prior to the
+ merge but needs to have Jeremy's space-saving hacks folded in.
+ Also the leak checker improvements. Ditto addrcheck.
+ Cachegrind is broken because it is not Vex-aware, and Vex needs
+ to be changed to convey info on instruction boundaries to it.
+ Helgrind is not Vex aware. Also, Helgrind will not work because
+ thread-event-modelling does not work (see below). Memcheck
+ and Addrcheck could be made to work with minor effort, and
+ that should happen asap. Cachegrind also needs to be fixed
+ shortly.
+
+* Function wrapping a la 2.4.0 is disabled, and will likely remain
+ disabled for an extended period until I consider the software
+ engineering consequences of it, specifically if a cleaner
+  implementation is possible. The result is that thread-event
+ modelling and Helgrind are also disabled for that period.
+
+* signal contexts for x86 signal deliveries are partially broken. On
+ delivery of an rt-signal, a context frame is built, but only the 8
+ integer registers and %eflags are written into it, no SSE and no FP
+ state. Also, the vcpu state is restored on return to whatever it
+ was before the signal was delivered; it is not restored from the
+ sigcontext offered to the handler. That means handlers which
+ expect to be able to modify the machine state will not work.
+ This will be fixed; it requires a small amount of work on the
+ Vex side.
+
+* I got rid of extra UInt* flags arg for syscall pre wrappers,
+ so they can't add MayBlock after examining the args. Should
+  be reinstated. I commented out the various "*flags |= MayBlock" lines
+ so they can easily enough be put back in.
+
+* Tracking of device segments is somehow broken (I forget how)
+
+* Core dumping is disabled (has been for a while in the 3.0 line)
+ because it needs to be factored per arch (or is it per arch+os).
+
+
+Other notes I made:
+
+* Check tests/filter_stderr_basic; I got confused whilst merging it
+
+* Dubious use of setjmp in run_thread_for_a_while -- I thought it
+ was only OK to use setjmp as the arg of an if: if (setjmp(...)) ...
+
+* EmWarn/Int confusion -- what type is it in the guest state?
+
+* Reinstate per-thread dispatch ctrs. First find out what the
+ rationale for per-thread counters is.
+
+* main: TL_(fini) is not given exitcode and it should be.
+
+* Prototype for VG_(_client_syscall) [note leading _] is in a
+ bad place.
+
+(It was a 3-way merge, using the most recent common ancestor
+ of the 2.4.0 and 3.0 lines:
+
+ cvs co -D "11/19/2004 17:45:00 GMT" valgrind
+
+ and the 2.4.0 line
+
+ obtained at Fri Mar 4 15:52:46 GMT 2005 by:
+ cvs co valgrind
+
+ and the 3.0 line, which is svn revision 3261.
+)
+
+
+Cleanup notes derived from making AMD64 work. JRS, started 2 March 05.
+~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+The following cleanups need to be done.
+
+AMD64 vsyscalls
+~~~~~~~~~~~~~~~
+The redirect mechanism should (could) be used to support vsyscalls on
+both amd64 and x86, by redirecting jumps to the vsyscall entry
+point(s) to appropriate helper stubs instead. There is no point in
+using the current x86 scheme of copying the trampoline code around the
+place and making the AT_SYSINFO entry point at it, as that mechanism
+does not work on amd64.
+
+On x86-linux, the vsyscall address is whatever the AT_SYSINFO entry
+says it is. Reroute all jumps to that to a suitable stub.
+
+On amd64, there are multiple vsyscall entry points at -10M +
+1024*vsyscall_no (currently there are only two). These each need to be
+redirected to suitable stubs which do normal syscalls instead.
+
+These redirects should be set up as part of platform-specific
+initialisation sequences. They should not be set up as at present in
+vg_symtab2.c. All this stuff should be within platform-specific
+startup code, and should not be visible in generic core service code.
+
+
+Redirection mechanism
+~~~~~~~~~~~~~~~~~~~~~
+How this works is difficult to understand. This should be fixed. The
+list of unresolved redirections should be a separate data structure
+from the currently active (addr, addr) mapping.
+
+There's a whole big #ifdef TEST section in vg_symtab2.c which has
+no apparent purpose.
+
+The redirecting-symtab-loader seems like a good idea on the face
+of it: you can write functions whose name says, in effect
+ "i_am_a_replacement_for_FOO"
+and then all jumps/calls to FOO get redirected there. Problem is
+that the naming mechanism involves $ signs etc in symbol names, which
+makes it very fragile. TODO: (1) figure out if we still need
+this, and if so (2) fix.
+
+
+System call handlers
+~~~~~~~~~~~~~~~~~~~~
+The pre/post functions should be factored into: marshallers, which get
+the syscall args from wherever they live, and handlers proper, which
+do whatever pre/post checks/handling is needed. The handlers are
+more or less platform independent. The marshallers insulate the
+handlers from details of knowing how to get hold of syscall arg/result
+values given that different platforms use different and sometimes
+strange calling conventions.
+
+The syscall handlers assume that the result register (RES) does not
+overlap with any argument register (ARGn). They assume this by
+blithely referring to ARGn in the post-handlers. This should be fixed
+properly -- before the call, a copy of the args should be saved so
+they can be safely inspected after the call.
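+
+A sketch of the proposed fix (helper names invented): snapshot the
+args before handing off to the kernel, and make the post-handler read
+only the snapshot:
+
+   UWord saved_args[8];
+   Int   i;
+   for (i = 0; i < 8; i++)
+      saved_args[i] = get_syscall_arg(tst, i);   /* hypothetical */
+   res = do_syscall_to_kernel(sysno, saved_args);
+   call_post_handler(sysno, res, saved_args);    /* never reads live
+                                                    ARGn registers */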
+
+The mechanisms by which a pre-handler can complete a syscall itself
+without handing it off to the kernel need to be cleaned up. The
+"Special" syscall designation no longer really makes sense (it never
+did) and should be removed.
+
+Sockets: move the socketcall marshaller from vg_syscalls.c into
+x86-linux/syscalls.c; it is in the wrong place.
+
diff --git a/docs/porting-HOWTO b/docs/internals/porting-HOWTO.txt
similarity index 100%
rename from docs/porting-HOWTO
rename to docs/internals/porting-HOWTO.txt
diff --git a/docs/porting-to-ARM b/docs/internals/porting-to-ARM.txt
similarity index 100%
rename from docs/porting-to-ARM
rename to docs/internals/porting-to-ARM.txt
diff --git a/docs/internals/segments-seginfos.txt b/docs/internals/segments-seginfos.txt
new file mode 100644
index 0000000..23513af
--- /dev/null
+++ b/docs/internals/segments-seginfos.txt
@@ -0,0 +1,59 @@
+
+-----------------------------------------------------------------------------
+Info about the relationship between Segments and SegInfos
+-----------------------------------------------------------------------------
+
+SegInfo is from the very original Valgrind code, and so it predates
+Segments. It's poorly named now; it's really just a container for all
+the object file metadata (symbols, debug info, etc).
+
+Segments describe memory mapped into the address space, and so any
+address-space changing operation needs to update the Segment structure.
+After the process is initialized, this means one of:
+
+ * mmap
+ * munmap
+ * mprotect
+ * brk
+ * stack growth
+
+A piece of address space may or may not be mmaped from a file.
+
+A SegInfo specifically describes memory mmaped from an ELF object file.
+Because a single ELF file may be mmaped with multiple Segments, multiple
+Segments can point to one Seginfo. A SegInfo can relate to a memory
+range which is not yet mmaped. For example, if the process mmaps the
+first page of an ELF file (the one containing the header), a SegInfo
+will be created for that ELF file's mappings, which will include memory
+which will be later mmaped by the client's ELF loader. If a new mmap
+appears in the address range of an existing SegInfo, it will have that
+SegInfo attached to it, presumably because it's part of a .so file.
+Similarly, if a Segment gets split (by mprotect, for example), the two
+pieces will still be associated with the same SegInfo. For this reason,
+the address/length info in a SegInfo is not a duplicate of the Segment
+address/length.
+
+This is complex for several reasons:
+
+ 1. We assume that if a process is mmaping a file which contains an
+ ELF header, it intends to use it as an ELF object. If a program
+    mmaps an ELF file but uses it only as raw data (a file-copying
+    program, for example), we still treat it as a shared-library opening.
+ 2. Even if it is being loaded as a shared library/other ELF object,
+ Valgrind doesn't control the mmaps. It just observes the mmaps
+ being generated by the client and has to cope. One of the reasons
+ that Valgrind has to make its own mmap of each .so for reading
+    symtab information is because the client won't necessarily mmap the
+    right pieces, or may do so in the wrong order for us.
+
+SegInfos are reference counted, and freed when no Segments point to them any
+more.
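+
+Illustrative shape only (field and helper names invented, not the
+real declarations):
+
+   typedef struct _SegInfo {
+      Int   ref;     /* number of Segments pointing at us */
+      Addr  base;    /* start of the object's mapped range */
+      SizeT len;     /* may exceed any single Segment's length */
+      /* ... symbols, debug info, etc ... */
+   } SegInfo;
+
+   /* when a Segment is unmapped or replaced: */
+   if (si != NULL && --si->ref == 0)
+      free_seginfo(si);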
+
+> Aha. So the range of a SegInfo will always be equal to or greater
+> than the range of its parent Segment? Or can you eg. mmap a whole
+> file plus some extra pages, and then the SegInfo won't cover the extra
+> part of the range?
+
+That would be unusual, but possible. You could imagine ld generating an
+ELF file via a mapping this way (which would probably upset Valgrind no
+end).
diff --git a/docs/internals/threads-syscalls-signals.txt b/docs/internals/threads-syscalls-signals.txt
new file mode 100644
index 0000000..5a083c1
--- /dev/null
+++ b/docs/internals/threads-syscalls-signals.txt
@@ -0,0 +1,300 @@
+
+/* Make a thread the running thread. The thread must previously have been
+ sleeping, and not holding the CPU semaphore. This will set the
+ thread state to VgTs_Runnable, and the thread will attempt to take
+ the CPU semaphore. By the time it returns, tid will be the running
+ thread. */
+extern void VG_(set_running) ( ThreadId tid );
+
+/* Set a thread into a sleeping state. Before the call, the thread
+ must be runnable, and holding the CPU semaphore. When this call
+ returns, the thread will be set to the specified sleeping state,
+ and will not be holding the CPU semaphore. Note that another
+ thread could be running by the time this call returns, so the
+ caller must be careful not to touch any shared state. It is also
+ the caller's responsibility to actually block until the thread is
+ ready to run again. */
+extern void VG_(set_sleeping) ( ThreadId tid, ThreadStatus state );
+
+
+The master semaphore is run_sema in vg_scheduler.c.
+
+
+(what happens at a fork?)
+
+VG_(scheduler_init) registers sched_fork_cleanup as a child atfork
+handler. sched_fork_cleanup, among other things, reinitializes the
+semaphore with a new pipe so the process has its own.
+
+--------------------------------------------------------------------
+
+Re: New World signal handling
+From: Jeremy Fitzhardinge <jeremy@goop.org>
+To: Julian Seward <jseward@acm.org>
+Date: Mon Mar 14 09:03:51 2005
+
+Well, the big-picture things to be clear about are:
+
+ 1. signal handlers are process-wide global state
+ 2. signal masks are per-thread (there's no notion of a process-wide
+ signal mask)
+ 3. a signal can be targeted to either
+ 1. the whole process (any eligible thread is picked for
+ delivery), or
+ 2. a specific thread
+
+1 is why it is always a bug to temporarily reset a signal handler (say,
+for SIGSEGV), because if any other thread happens to be sent one in that
+window it will cause havoc (I think there's still one instance of this
+in the symtab stuff).
+2 is the meat of your questions; more below.
+3 is responsible for some of the nitty detail in the signal stuff, so
+it's worth bearing in mind to understand it all. (Note that even if a
+signal is targeting the whole process, it's only ever delivered to one
+particular thread; there's no such thing as a broadcast signal.)
+
+While a thread is running core code or generated code, it has almost
+all its signals blocked (all but the fault signals: SEGV, BUS, ILL, etc).
+
+Every N basic blocks, each thread calls VG_(poll_signals) to see what
+signals are pending for it. poll_signals grabs the next pending signal
+which the client signal mask doesn't block, and sets it up for delivery;
+it uses the sigtimedwait() syscall to fetch blocked pending signals
+rather than have them delivered to a signal handler. This means that
+we avoid the complexity of having signals delivered asynchronously via
+the signal handlers; we can just poll for them synchronously when
+they're easy to deal with.
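+
+In outline (a sketch, not the real code; 'deliverable' stands for the
+set of signals the client's mask leaves unblocked):
+
+   #include <signal.h>
+
+   siginfo_t       si;
+   struct timespec zero = { 0, 0 };
+   /* fetch a pending signal without running any handler; returns
+      immediately (with -1/EAGAIN) if nothing is pending */
+   int signo = sigtimedwait(&deliverable, &si, &zero);
+   if (signo > 0)
+      set_up_for_delivery(signo, &si);   /* hypothetical */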
+
+Fault signals, being caused by a specific instruction, are the exception
+because they can't be held off; if they're blocked when an instruction
+raises one, the kernel will just summarily kill the process. Therefore,
+they need to be always unblocked, and the signal handler is called when
+an instruction raises one of these exceptions. (It's also necessary to
+call poll_signals after any syscall which may raise a signal, since
+signal-raising syscalls are considered to be synchronous with respect to
+their signal; ie, calling kill(getpid(), SIGUSR1) will call the handler
+for SIGUSR1 before kill is seen to complete.)
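+
+In client terms:
+
+   signal(SIGUSR1, handler);
+   kill(getpid(), SIGUSR1);   /* handler must have run by the time
+                                 kill() is seen to complete */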
+
+The one time when the thread's real signal mask actually matches the
+client's requested signal mask is while running a blocking syscall. We
+have to set things up to accept signals during a syscall so that we get
+the right signal-interrupts-syscall semantics. The tricky part about
+this is that there's no general atomic
+set-signal-mask-and-block-in-syscall mechanism, so we need to fake it
+with the stuff in VGA_(_client_syscall)/VGA_(interrupted_syscall).
+These two basically form an explicit state machine, where the state
+variable is the instruction pointer, which allows it to determine what
+point the syscall got to when the async signal happens. By keeping the
+window where signals are actually unblocked very narrow, the number of
+possible states is pretty small.
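+
+Schematically (labels invented; the real code keys off the actual
+instruction addresses within the tiny syscall sequence):
+
+   if (ip < addr_mask_now_set)          /* mask not yet set:         */
+      restart_syscall_later();          /*   nothing happened yet    */
+   else if (ip <= addr_syscall_insn)    /* inside the open window:   */
+      fixup_interrupted_syscall(tst);   /*   may or may not have run */
+   else                                 /* window already closed:    */
+      complete_normally(tst);           /*   result is in place      */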
+
+This is all quite nice because the kernel does almost all the work of
+determining which thread should get a signal, what the correct action
+for a syscall when it has been interrupted is, etc. Particularly nice
+is that we don't need to worry about all the queuing semantics, and the
+per-signal special cases (which are, roughly, signals 1-32 are not queued
+except when they are, and signals 33-64 are queued except when they aren't).
+
+BUT, there's another complexity: because the Unix signal mechanism has
+been overloaded to deal with two separate kinds of events (asynchronous
+signals raised by kill(), and synchronous faults raised by an
+instruction), we can't block a signal for one form and not the other.
+That is, because we have to leave SIGSEGV unblocked for faulting
+instructions, it also leaves us open to getting an async SIGSEGV sent
+with kill(pid, SIGSEGV).
+
+To handle this case, there's a small per-thread signal queue (I'm
+using tid 0's queue for "signals sent to the
+whole process" - a hack, I'll admit). If an async SIGSEGV (etc) signal
+appears, then it is pushed onto the appropriate queue.
+VG_(poll_signals) also checks these queues for pending signals to decide
+what signal to deliver next. These queues are only manipulated with
+*all* signals blocked, so there's no risk of two concurrent async signal
+handlers modifying the queues at once. Also, because the likelihood of
+actually being sent an async SIGSEGV is pretty low, the queues are only
+allocated on demand.
+
+
+
+There are two mechanisms to prevent disaster if multiple threads get
+signals concurrently. One is that a signal handler is set up to block a
+set of signals while the signal is being delivered. Valgrind's handlers
+block all signals, so there's no risk of a new signal being delivered to
+the same thread until the old handler has finished.
+
+The other is that if the thread which receives the signal is not running
+(ie, doesn't hold the run_sema, which implies it must be waiting for a
+syscall to complete), then the signal handler will grab the run_sema
+before making any global state changes. Since the only time we can get
+an async signal asynchronously is during a blocking syscall, this should
+be all the time. (And since synchronous signals are always the result of
+running an instruction, we should already be holding run_sema.)
+
+
+Valgrind will occasionally generate signals for itself. These are always
+synchronous faults resulting from instruction fetch or from something an
+instruction did. The two mechanisms are the synth_fault_* functions,
+which are used to signal a problem while fetching an instruction, or by
+getting generated code to call a helper which contains a fault-raising
+instruction (used to deal with illegal/unimplemented instructions and
+for instructions whose only job is to raise exceptions).
+
+That all explains how signals come in, but the second part is how they
+get delivered.
+
+The main function for this is VG_(deliver_signal). There are three cases:
+
+ 1. the process is ignoring the signal (SIG_IGN)
+ 2. the process is using the default handler (SIG_DFL)
+ 3. the process has a handler for the signal
+
+In general, VG_(deliver_signal) shouldn't be called for ignored signals;
+if it has been called, it assumes the ignore is being overridden (if an
+instruction gets a SEGV etc, SIG_IGN is ignored and treated as SIG_DFL).
+
+VG_(deliver_signal) handles the default handler case, and the
+client-specified signal handler case.
+
+The default handler case is relatively easy: the signal's default action
+is either Terminate, or Ignore. We can ignore Ignore.
+
+Terminate always kills the entire process; there's no such thing as a
+thread-specific signal death. Terminate comes in two forms: with
+coredump, or without. vg_default_action() will write a core file, and
+then will tell all the threads to start terminating; it then longjmps
+back to the current thread's scheduler loop. The scheduler loop will
+terminate immediately, and the master_tid thread will wait for all the
+others to exit before shutting down the process (this is the same
+mechanism as exit_group).
+
+Delivering a signal to a client-side handler modifies the thread state so
+that there's a signal frame on the stack, and the instruction pointer is
+pointing to the handler. The fiddly bit is that there are two
+completely different signal frame formats: old and RT. While in theory
+the exact shape of these frames on stack is abstracted, there are real
+programs which know exactly where various parts of the structures are on
+stack (most notably, g++'s exception throwing code), which is why it has
+to have two separate pieces of code for each frame format. Another
+tricky case is dealing with the client stack running out/overflowing
+while setting up the signal frame.
+
+Signal return is also interesting. There are two syscalls, sigreturn
+and rt_sigreturn, which a signal handler will use to resume execution.
+The client will call the right one for the frame it was passed, so the
+core doesn't need to track that state. The tricky part is moving the
+frame's register state back into the thread's state, particularly all
+the FPU state reformatting gunk. Also, *sigreturn checks for new
+pending signals after the old frame has been cleaned up, since there's a
+requirement that all deliverable pending signals are delivered before
+the mainline code makes progress. This means that a program could
+live-lock on signals, but that's what would happen running natively...
+
+Another thing to watch for: programs which unwind the stack (like gdb,
+or exception throwers) recognize the existence of a signal frame by
+looking at the code the return address points to: if it is one of the
+two specific signal return sequences, it knows it's a signal frame.
+That's why the signal handler return address must point to a very
+specific set of instructions.
+
+
+What else. Ah, the two internal signals.
+
+SIGVGKILL is pretty straightforward: it's just used to dislodge a thread
+from being blocked in a syscall, so that we can get the thread to
+terminate in a timely fashion.
+
+SIGVGCHLD is used by a thread to tell the master_tid that it has
+exited. However, the only time the master_tid cares about this is when
+it has already exited, and it's waiting for everyone else to exit. If
+the master_tid hasn't exited, then this signal is ignored. It isn't
+enough to simply block it, because that will cause a pile of queued
+SIGVGCHLDs to build up, eventually clogging the kernel's signal delivery
+mechanism. If it's unblocked and ignored, it doesn't interrupt syscalls
+and it doesn't accumulate.
+
+
+I hope that helps clarify things. And explain why there's so much stuff
+in there: it's tracking a very complex and arcane underlying set of
+machinery.
+
+ J
+
+--------------------------------------------------------------------
+
+>I've been seeing references to 'master thread' around the place.
+>What distinguishes the master thread from the rest? Where does
+>the requirement to have a master thread come from?
+>
+It used to be tid 1, but I had to generalize it.
+
+The master_tid isn't very special; its main job is at process shutdown.
+It waits for all the other threads to exit, and then produces all the
+final reports. Until it exits, it's just a normal thread, with no other
+responsibilities.
+
+The alternative to having a master thread would be to make whichever
+thread exits last be responsible for emitting all the output. That
+would work, but it would make the results a bit asynchronous (that is,
+if the main thread exits and the others hang around for a while, anyone
+waiting on the process would see it as having exited, but no results
+would have been produced).
+
+VG_(master_tid) is a variable to handle the case where a threaded program
+forks. In the first process, the master_tid will be 1. If that program
+creates a few threads, and then, say, thread 3 forks, the child process
+will have a single thread in it. In the child, master_tid will be 3.
+It was easier to make the master thread a variable than to try to work
+out how to rename thread 3 to 1 after a fork.
+
+ J
+
+--------------------------------------------------------------------
+
+Re: Fwd: Documentation of kernel's signal routing ?
+From: David Woodhouse <...>
+To: Julian Seward <jseward@acm.org>
+
+> Regarding sys_clone created threads. I have a vague idea that
+> there is a notion of 'thread group'. I further understand that if
+> one thread in a group calls sys_exit_group then all threads in that
+> group exit. Whereas if a thread calls sys_exit then just that
+> thread exits.
+>
+> I'm pretty hazy on this:
+
+Hmm, so am I :)
+
+> * Is the above correct?
+
+Yes, I believe so.
+
+> * How is thread-group membership defined/changed?
+
+By specifying CLONE_THREAD in the flags to clone(), you remain part of
+the same thread group as the parent. In a single-threaded process, the
+thread group id (tgid) is the same as the pid.
+
+Linux just has tasks, which sometimes happen to share VM -- and now with
+NPTL we also share other stuff like signals, etc. The 'pid' in Linux is
+what POSIX would call the 'thread id', and the 'tgid' in Linux is
+equivalent to the POSIX 'pid'.
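+
+So an NPTL-style thread creation boils down to something like (flag
+set abbreviated; the real one carries several more flags):
+
+   #define _GNU_SOURCE
+   #include <sched.h>
+
+   clone(thread_fn, stack_top,
+         CLONE_VM | CLONE_FS | CLONE_FILES | CLONE_SIGHAND | CLONE_THREAD,
+         arg);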
+
+> * Do you know offhand how LinuxThreads and NPTL use thread groups?
+
+I believe that LT doesn't use the kernel's concept of thread groups at
+all. LT predates the kernel's support for proper POSIX-like sharing of
+anything much but memory, so uses only the CLONE_VM (and possibly
+CLONE_FILES) flags. I don't _think_ it uses CLONE_SIGHAND -- it does
+most of its work by propagating signals manually between threads.
+
+NPTL uses thread groups as generated by the CLONE_THREAD flag, which is
+what invokes the POSIX-related thread semantics.
+
+> Is it the case that each LinuxThreads threads is in its own
+> group whereas all NPTL threads [in a process] are in a single
+> group?
+
+Yes, that's my understanding.
+
+--
+dwmw2
diff --git a/docs/tm-mutexstates.dot b/docs/internals/tm-mutexstates.dot
similarity index 100%
rename from docs/tm-mutexstates.dot
rename to docs/internals/tm-mutexstates.dot
diff --git a/docs/tm-threadstates.dot b/docs/internals/tm-threadstates.dot
similarity index 100%
rename from docs/tm-threadstates.dot
rename to docs/internals/tm-threadstates.dot
diff --git a/docs/internals/tracking-fn-entry-exit.txt b/docs/internals/tracking-fn-entry-exit.txt
new file mode 100644
index 0000000..40270c8
--- /dev/null
+++ b/docs/internals/tracking-fn-entry-exit.txt
@@ -0,0 +1,205 @@
+
+This file describes in detail how Calltree accurately tracks function
+entry/exit, one of those harder-than-you'd-think things.
+
+-----------------------------------------------------------------------------
+Josef's description
+-----------------------------------------------------------------------------
+From: Josef Weidendorfer <Josef.Weidendorfer@gmx.de>
+To: Nicholas Nethercote <njn25@cam.ac.uk>
+Cc: valgrind-developers@lists.sourceforge.net
+Subject: [Valgrind-developers] Re: Tracking function entry/exit
+
+On Sunday 25 January 2004 16:53, Nicholas Nethercote wrote:
+> Josef,
+>
+> The topic of tracking function entry/exit has come up a few times on the
+> mailing lists recently. My usual answer is that it's difficult to do
+> correctly. However, you seem to do it with Calltree. I looked at the
+> source code a bit, and it looks like you are doing some reasonably
+> complicated things to get it right, eg. unwinding the stack. How robust
+> is your approach? Can you briefly explain how it works?
+
+A note before describing the mechanism: I need to have a helper call at start
+of every BB anyway, so I use this helper to do the tracking. This of course
+has some overhead, and perhaps can be avoided, but it seems to add to the
+robustness. I have a bug fix here for reentrant entry of a signal handler
+(2 bug reports). Otherwise I have no bug reports, so I assume the
+mechanism is quite robust.
+
+I have a shadow call stack for every thread. For signal handlers of a thread,
+I first PUSH a separation marker on the shadow stack, and use the stack as
+normal. The marker is used for unwinding when leaving the signal handler.
+This is fine as there is no scheduling among signal handlers of one thread.
+
+Instrumentation of calltree:
+* Store at the end of each basic block the jmpkind into a tool-global, static
+variable.
+* At the start of every BB, jump to a helper function.
+
+The helper function does the following regarding function call tracking:
+- for a control transfer to another ELF object/ELF section, override jmpkind
+ with a CALL (*1)
+- for a control transfer to the 1st basic block of a function, override
+ jmpkind with a CALL (*2)
+- do unwinding if needed (i.e, POPs of the shadow call stack)
+- if jmpkind is RET and there was no unwinding/POP:
+ - if our call stack is empty, simulate a CALL lasting from beginning
+ (with Valgrind 2.1.x, this is not needed any more, as we run on
+ simulated CPU from first client instruction)
+ - otherwise this is a JMP using a RET instruction (typically used in
+ the runtime linker). Do a POP, setting previous BB address to call
+ site and override jmpkind with a CALL. By this, you get 2 function
+ calls from a calling site.
+- when jmpkind is a CALL, push new function call from previous BB to current
+ BB on shadow call stack.
+- Save current BB address to be available for call to handler in next BB.
+
+Special care is needed at thread switches and enter/leave of signal handlers,
+as we need separate shadow call stacks.
+
+Known bug: We should check for the need of unwinding when ESP is explicitly
+written to. I hope this doesn't create too much overhead.
+
+Remarks:
+(*1) Jumps between ELF objects are function calls to a shared library. This is
+ mainly done to catch the JMP from PLT code.
+(*2) This is what your function tracking skin/tool does. It is needed here
+ mainly to catch tail recursion. In general, for functions doing a
+ "return otherfunction()", GCC produces JMPs with -O2.
+
+Additional points:
+- If I need a name for a function, but there is no debug info, I use the
+ instruction address minus the load offset of the corresponding ELF object
+ (if there is one) to get a relative address for that ELF object. This
+  offset can be used later in postprocessing tools (e.g. objdump). I
+  would suggest this change even for cachegrind instead of a
+ "???".
+- I introduced the ability to specify functions to be "skipped". This means
+ that execution of these functions is attributed to the calling function.
+ The default is to skip all functions located in PLT sections. Thus, in
+ effect, costs of PLT functions are attributed to callers, and the call to
+ a shared library function starts directly with code in the other ELF
+ object.
+- As Vg 2.1.x does pointerchecking, the instrumentation can't write to
+ memory space of Valgrind any longer. Currently, my tool needs
+  "--pointercheck=no" to be able to run. Jeremy and I have already
+  agreed on replacing the current LD/ST with CLD/CST (Client
+  Load/Store) with pointer check, and keeping the original LD/ST for
+  tool usage without pointerchecking.
+
+Looking at these things, it seems possible to do function tracking at end of a
+basic block instead of the beginning of the next BB. This way, we can perhaps
+avoid calls to helpers at every BB.
+
+From my point of view, it would be great to integrate optional function
+tracking into Valgrind core with some hooks.
+
+Josef
+
+
+-----------------------------------------------------------------------------
+Josef's clarification of Nick's summary of Josef's description
+-----------------------------------------------------------------------------
+On Monday 21 June 2004 12:15, Nicholas Nethercote wrote:
+
+> I've paraphrased your description to help me understand it better, but I'm
+> still not quite clear on some points. I looked at the code, but found it
+> hard to understand. Could you help me? I've written my questions in
+> square brackets. Here's the description.
+>
+> --------
+>
+> Data structures:
+>
+> - have a shadow call stack for every thread
+> [not sure exactly what goes on this]
+
+That's the resizable array of struct _call_entry's.
+Probably most important for call tracking is the %ESP value
+directly after a CALL, and a pointer to some struct storing information
+about the call arc or the called function.
+
+The esp value is needed to be able to unwind robustly when %esp
+changes to a value above the esp stored on the shadow stack.
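+
+A sketch of the idea (names invented):
+
+   struct _call_entry {
+      Addr  sp;    /* %esp directly after the CALL */
+      void* fn;    /* info about the call arc / callee */
+   };
+
+   /* unwind rule: a current %esp above a saved sp means that frame
+      is gone -- pop it and emit a RETURN event */
+   while (top > 0 && current_esp > stack[top-1].sp)
+      pop_and_emit_return(&stack[--top]);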
+
+> Action at BB start -- depends on jmp_kind from previous BB:
+>
+> - If jmp_kind is neither JmpCall nor JmpRet (ie. is JmpNone, JmpBoring,
+> JmpCond or JmpSyscall) and we transferred from one ELF object/section to
+> another, it must be a function call to a shared library -- treat as a
+> call. This catches jmps from PLT code.
+>
+> - If this is the first BB of a function, treat as a call. This catches
+> tail calls (which gcc uses for "return f()" with -O2).
+> [What if a function had a 'goto' back to its beginning? Would that be
+> interpreted as a call?]
+
+Yes. IMHO, there is no way to distinguish between optimized tail recursion
+using a jump and regular jumping. But as most functions need parameters on
+the stack, a normal jump will rarely jump to the first BB of a function,
+wouldn't it?
+
+> - Unwind the shadow call stack if necessary.
+> [when is "necessary"? If the real %esp > the shadow stack %esp?]
+
+Yes. Currently I do this at every BB boundary, but perhaps it should be
+checked at every %esp change. Then, OTOH, it would look strange to attribute
+instructions of one BB to different functions?
+
+> - If this is a function return and there was no shadow stack unwinding,
+> this must be a RET control transfer (typically used in the runtime
+> linker). Pop the shadow call stack, setting the previous BB address to
+> call site and override jmpkind with a CALL. By this, you get 2 function
+> calls from a calling site.
+> [I don't understand this... What is a "RET control transfer"? Why do
+> you end up with 2 function calls -- is that a bad thing?]
+
+If there is a RET instruction, this usually should unwind (i.e. leave a
+function) at least one entry of the shadow call stack. But this doesn't need
+to be the case, i.e. even after a RET, %esp could be lower or equal to the
+one on the shadow stack. E.g. suppose
+
+ PUSH addr
+ RET
+
+This is only another way of saying "JMP addr", and doesn't add/remove any
+stack frame at all.
+Now, if addr is (according to debug information) inside of another function,
+this is a JMP between functions, let's say from B to C. Suppose B was called
+from A, I generate a RETURN event to A and a CALL event from A to C in this
+case.
+
+> - If we're treating the control transfer as a call, push new function call
+> from previous BB to current BB on shadow call stack.
+> [when is this information used?]
+
+I meant: Append a struct call_entry to the shadow stack (together with the
+current %esp value). As I said before, the shadow stack is used for robust
+unwinding.
+
+> - Save current BB address to be available for call to handler in next BB.
+>
+>
+> Other actions:
+>
+> When entering a signal handler, first push a separation marker on the
+> thread's shadow stack, then use it as normal. The marker is used for
+> unwinding when leaving the signal handler. This is fine as there is no
+> scheduling among signal handlers of one thread.
+>
+> Special care is needed at thread switches and enter/leave of signal
+> handlers, as we need separate shadow call stacks.
+> [Do you mean "separate shadow call stacks for each thread"?]
+
+Yes.
+
+> What about stack switching -- does it cope with that? (Not that Valgrind
+> in general does...)
+
+No.
+If you could give me a hint how to do it, I would be pleased. The problem here
+IMHO is: How to distinguish between a stack switch and allocating a huge array
+on the stack?
+
+Josef
+
diff --git a/docs/internals/xml-output.txt b/docs/internals/xml-output.txt
new file mode 100644
index 0000000..ec8fa57
--- /dev/null
+++ b/docs/internals/xml-output.txt
@@ -0,0 +1,386 @@
+
+As of May 2005, Valgrind can produce its output in XML form. The
+intention is to provide an easily parsed, stable format which is
+suitable for GUIs to read.
+
+
+Design goals
+~~~~~~~~~~~~
+
+* Produce XML output which is easily parsed
+
+* Have a stable output format which does not change much over time, so
+ that investments in parser-writing by GUI developers is not lost as
+  that investments in parser-writing by GUI developers are not lost as
+
+* Have an extensible output format, so that future changes to the
+ format do not break backwards compatibility with existing parsers of
+ it.
+
+* Produce output in a form which is suitable for both offline GUIs (run
+ all the way to the end, then examine output) and interactive GUIs
+ (parse XML incrementally, update display as we go).
+
+* Put as much information as possible into the XML and let the GUIs
+  decide what to show the user (a.k.a. provide mechanism, not policy).
+
+* Make XML which is actually parseable by standard XML tools.
+
+
+How to use
+~~~~~~~~~~
+
+Run with flag --xml=yes. That's all. Note, however, several
+caveats.
+
+* At the present time only Memcheck is supported. The scheme extends
+ easily enough to cover Addrcheck and Helgrind if needed.
+
+* When XML output is selected, various other settings are made.
+ This is in order that the output format is more controlled.
+ The settings which are changed are:
+
+ - Suppression generation is disabled, as that would require user
+ input.
+
+ - Attaching to GDB is disabled for the same reason.
+
+ - The verbosity level is set to 1 (-v).
+
+ - Error limits are disabled. Usually if the program generates a lot
+ of errors, Valgrind slows down and eventually stops collecting
+ them. When outputting XML this is not the case.
+
+ - VEX emulation warnings are not shown.
+
+ - File descriptor leak checking is disabled. This could be
+ re-enabled at some future point.
+
+ - Maximum-detail leak checking is selected (--leak-check=full).
+
+
+The output format
+~~~~~~~~~~~~~~~~~
+For the most part this should be self-descriptive. It is printed in a
+sort-of human-readable way for easy understanding. You may want to
+read the rest of this together with the results of "valgrind --xml=yes
+memcheck/tests/xml1" as an example.
+
+All tags are balanced: a <foo> tag is always closed by </foo>. Hence
+in the description that follows, mention of a tag <foo> implicitly
+means there is a matching closing tag </foo>.
+
+Symbols in CAPITALS are nonterminals in the grammar and are defined
+somewhere below. The root nonterminal is TOPLEVEL.
+
+The following nonterminals are not described further:
+ INT is a 64-bit signed decimal integer.
+ TEXT is arbitrary text.
+ HEX64 is a 64-bit hexadecimal number, with leading "0x".
+
+Text strings are escaped so as to remove the <, > and & characters
+which would otherwise mess up parsing. They are replaced respectively
+with the standard encodings "&lt;", "&gt;" and "&amp;".
+Note this is not (yet) done throughout, only for function names in
+<frame>..</frame> tag-pairs.
+
+
+TOPLEVEL
+--------
+
+The first line output is always this:
+
+ <?xml version="1.0"?>
+
+All remaining output is contained within the tag-pair
+<valgrindoutput>.
+
+Inside that, the first entity is an indication of the protocol
+version. This is provided so that existing parsers can identify XML
+created by future versions of Valgrind merely by observing that the
+protocol version is one they don't understand. Hence TOPLEVEL is:
+
+ <?xml version="1.0"?>
+ <valgrindoutput>
+ <protocolversion>INT</protocolversion>
+ VERSION1STUFF
+ </valgrindoutput>
+
+The only currently defined protocol version number is 1. This
+document only defines protocol version 1.
+
+
+VERSION1STUFF
+-------------
+This is the main top-level construction. Roughly speaking, it
+contains a load of preamble, the errors from the run of the
+program, and the result of the final leak check. Hence the
+following in sequence:
+
+* Various preamble lines which give version info for the various
+ components. The text in them can be anything; it is not intended
+ for interpretation by the GUI:
+
+ <preamble>
+ <line>Misc version/copyright text</line> (zero or more of)
+ </preamble>
+
+* The PID of this process and of its parent:
+
+ <pid>INT</pid>
+ <ppid>INT</ppid>
+
+* The name of the tool being used:
+
+ <tool>TEXT</tool>
+
+* OPTIONALLY, if the --log-file-qualifier=VAR flag was given:
+
+ <logfilequalifier> <var>VAR</var> <value>$VAR</value>
+ </logfilequalifier>
+
+ That is, both the name of the environment variable and its value
+ are given.
+
+* OPTIONALLY, if --xml-user-comment=STRING was given:
+
+ <usercomment>STRING</usercomment>
+
+ STRING is not escaped in any way, so that it itself may be a piece
+ of XML with arbitrary tags etc.
+
+* The program and args: first those pertaining to Valgrind itself, and
+ then those pertaining to the program to be run under Valgrind (the
+ client):
+
+ <args>
+ <vargv>
+ <exe>TEXT</exe>
+ <arg>TEXT</arg> (zero or more of)
+ </vargv>
+ <argv>
+ <exe>TEXT</exe>
+ <arg>TEXT</arg> (zero or more of)
+ </argv>
+ </args>
+
+* The following, indicating that the program has now started:
+
+ <status> <what>RUNNING</what>
+ <when>human-readable-time-string</when>
+ </status>
+
+* Zero or more of (either ERROR or ERRORCOUNTS).
+
+* The following, indicating that the program has now finished, and
+ that the wrapup (leak checking) is happening.
+
+ <status> <what>FINISHED</what>
+ <when>human-readable-time-string</when>
+ </status>
+
+* SUPPCOUNTS, indicating how many times each suppression was used.
+
+* Zero or more ERRORs, each of which is a complaint from the
+ leak checker.
+
+That's it.
+
+
+ERROR
+-----
+This shows an error, and is the most complex nonterminal. The format
+is as follows:
+
+ <error>
+ <unique>HEX64</unique>
+ <tid>INT</tid>
+ <kind>KIND</kind>
+ <what>TEXT</what>
+
+ optionally: <leakedbytes>INT</leakedbytes>
+ optionally: <leakedblocks>INT</leakedblocks>
+
+ STACK
+
+ optionally: <auxwhat>TEXT</auxwhat>
+ optionally: STACK
+
+ </error>
+
+* Each error contains a unique, arbitrary 64-bit hex number. This is
+ used to refer to the error in ERRORCOUNTS nonterminals (see below).
+
+* The <tid> tag indicates the Valgrind thread number. This value
+ is arbitrary but may be used to determine which threads produced
+ which errors (at least, the first instance of each error).
+
+* The <kind> tag specifies one of a small number of fixed error
+ types (enumerated below), so that GUIs may roughly categorise
+ errors by type if they want.
+
+* The <what> tag gives a human-understandable description of the
+ error.
+
+* For <kind> tags specifying a KIND of the form "Leak_*", the
+ optional <leakedbytes> and <leakedblocks> indicate the number of
+ bytes and blocks leaked by this error.
+
+* The primary STACK for this error, indicating where it occurred.
+
+* Some error types may have auxiliary information attached:
+
+ <auxwhat>TEXT</auxwhat> gives an auxiliary human-readable
+ description (usually of invalid addresses)
+
+ STACK gives an auxiliary stack (usually the allocation/free
+ point of a block). If this STACK is present then
+ <auxwhat>TEXT</auxwhat> will precede it.
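+
+For concreteness, an InvalidRead error might be rendered as follows
+(all values invented):
+
+  <error>
+    <unique>0x1</unique>
+    <tid>1</tid>
+    <kind>InvalidRead</kind>
+    <what>Invalid read of size 4</what>
+    <stack>
+      <frame>
+        <ip>0x8048404</ip>
+        <obj>/home/me/a.out</obj>
+        <fn>main</fn>
+        <file>test.c</file>
+        <line>7</line>
+      </frame>
+    </stack>
+  </error>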
+
+
+KIND
+----
+This is a small enumeration indicating roughly the nature of an error.
+The possible values are:
+
+ InvalidFree
+
+ free/delete/delete[] on an invalid pointer
+
+ MismatchedFree
+
+ free/delete/delete[] does not match allocation function
+ (eg doing new[] then free on the result)
+
+ InvalidRead
+
+ read of an invalid address
+
+ InvalidWrite
+
+ write of an invalid address
+
+ InvalidJump
+
+ jump to an invalid address
+
+ Overlap
+
+ args overlap each other or are otherwise bogus in eg memcpy
+
+ InvalidMemPool
+
+ invalid mem pool specified in client request
+
+ UninitCondition
+
+ conditional jump/move depends on undefined value
+
+ UninitValue
+
+ other use of undefined value (primarily memory addresses)
+
+ SyscallParam
+
+ system call params are undefined or point to
+ undefined/unaddressable memory
+
+ ClientCheck
+
+ "error" resulting from a client check request
+
+ Leak_DefinitelyLost
+
+ memory leak; the referenced blocks are definitely lost
+
+ Leak_IndirectlyLost
+
+ memory leak; the referenced blocks are lost because all pointers
+ to them are also in leaked blocks
+
+ Leak_PossiblyLost
+
+ memory leak; only interior pointers to referenced blocks were
+ found
+
+ Leak_StillReachable
+
+ memory leak; pointers to un-freed blocks are still available
+
+
+STACK
+-----
+STACK indicates locations in the program being debugged. A STACK
+is one or more FRAMEs. The first is the innermost frame, the
+next its caller, etc.
+
+ <stack>
+ one or more FRAME
+ </stack>
+
+
+FRAME
+-----
+FRAME records a single program location:
+
+ <frame>
+ <ip>HEX64</ip>
+ optionally <obj>TEXT</obj>
+ optionally <fn>TEXT</fn>
+ optionally <dir>TEXT</dir>
+ optionally <file>TEXT</file>
+ optionally <line>INT</line>
+ </frame>
+
+Only the <ip> field is guaranteed to be present. It indicates a
+code ("instruction pointer") address.
+
+The optional fields, if present, appear in the order stated:
+
+* obj: gives the name of the ELF object containing the code address
+
+* fn: gives the name of the function containing the code address
+
+* dir: gives the source directory associated with the name specified
+ by <file>. Note the current implementation often does not
+ put anything useful in this field.
+
+* file: gives the name of the source file containing the code address
+
+* line: gives the line number in the source file
+
+
+ERRORCOUNTS
+-----------
+This specifies, for each error that has been so far presented,
+the number of occurrences of that error.
+
+ <errorcounts>
+ zero or more of
+ <pair> <count>INT</count> <unique>HEX64</unique> </pair>
+ </errorcounts>
+
+Each <pair> gives the current error count <count> for the error with
+unique tag <unique>. The counts do not have to give a count for each
+error so far presented - partial information is allowable.
+
+As of Valgrind rev 3793, error counts are only emitted at program
+termination. However, it is perfectly acceptable to periodically emit
+error counts as the program is running. Doing so would allow a
+GUI to dynamically update its error-count display as the program runs.
+
+
+SUPPCOUNTS
+----------
+A SUPPCOUNTS block appears exactly once, after the program terminates.
+It specifies the number of times each error-suppression was used.
+Suppressions not mentioned were used zero times.
+
+ <suppcounts>
+ zero or more of
+ <pair> <count>INT</count> <name>TEXT</name> </pair>
+ </suppcounts>
+
+The <name> is as specified in the suppression name fields in .supp
+files.
+