Add proper comments explaining the args for the tool instrumenation
function.
git-svn-id: svn://svn.valgrind.org/valgrind/trunk@6236 a5019735-40e9-0310-863c-91ae7b9d1cf9
diff --git a/include/pub_tool_tooliface.h b/include/pub_tool_tooliface.h
index e762bcc..3ce6b4d 100644
--- a/include/pub_tool_tooliface.h
+++ b/include/pub_tool_tooliface.h
@@ -74,7 +74,7 @@
/* Basic tool functions */
/* The tool_instrument function is passed as a callback to
- LibVEX_Translate. VgInstrumentClosure carries additional info
+ LibVEX_Translate. VgCallbackClosure carries additional info
which the instrumenter might like to know, but which is opaque to
Vex.
*/
@@ -93,14 +93,76 @@
// Instrument a basic block. Must be a true function, ie. the same
// input always results in the same output, because basic blocks
- // can be retranslated. Unless you're doing something really
- // strange... Note that orig_addr_noredir is not necessarily the
- // same as the address of the first instruction in the IR, due to
- // function redirection.
- IRBB*(*instrument)(VgCallbackClosure*,
- IRBB* bb_in,
- VexGuestLayout*, VexGuestExtents*,
- IRType gWordTy, IRType hWordTy),
+ // can be retranslated, unless you're doing something really
+ // strange. Anyway, the arguments. Mostly they are straightforward
+ // except for the distinction between redirected and non-redirected
+ // guest code addresses, which is important to understand.
+ //
+ // VgCallBackClosure* closure contains extra arguments passed
+ // from Valgrind to the instrumenter, which Vex doesn't know about.
+ // You are free to look inside this structure.
+ //
+ // * closure->tid is the ThreadId of the thread requesting the
+ // translation. Not sure why this is here; perhaps callgrind
+ // uses it.
+ //
+ // * closure->nraddr is the non-redirected guest address of the
+ // start of the translation. In other words, the translation is
+ // being constructed because the guest program jumped to
+ // closure->nraddr but no translation of it was found.
+ //
+ // * closure->readdr is the redirected guest address, from which
+ // the translation was really made.
+ //
+ // To clarify this, consider what happens when, in Memcheck, the
+ // first call to malloc() happens. The guest program will be
+ // trying to jump to malloc() in libc; hence ->nraddr will contain
+ // that address. However, Memcheck intercepts and replaces
+ // malloc, hence ->readdr will be the address of Memcheck's
+ // malloc replacement in
+ // coregrind/m_replacemalloc/vg_replacemalloc.c. It follows
+ // that the first IMark in the translation will be labelled as
+ // from ->readdr rather than ->nraddr.
+ //
+ // Since most functions are not redirected, the majority of the
+ // time ->nraddr will be the same as ->readdr. However, you
+ // cannot assume this: if your tool has metadata associated
+ // with code addresses it will get into deep trouble if it does
+ // make this assumption.
+ //
+ // IRBB* bb_in is the incoming bb to be instrumented, in flat IR
+ // form.
+ //
+ // VexGuestLayout* layout contains limited info on the layout of
+ // the guest state: where the stack pointer and program counter
+ // are, and which fields should be regarded as 'always defined'.
+ // Memcheck uses this.
+ //
+ // VexGuestExtents* vge points to a structure which states the
+ // precise byte ranges of original code from which this translation
+ // was made (there may be up to three different ranges involved).
+ // Note again that these are the real addresses from which the code
+ // came. And so it should be the case that closure->readdr is the
+ // same as vge->base[0]; indeed Cachegrind contains this assertion.
+ //
+ // Tools which associate shadow data with code addresses
+ // (cachegrind, callgrind) need to be particularly clear about
+ // whether they are making the association with redirected or
+ // non-redirected code addresses. Both approaches are viable
+ // but you do need to understand what's going on. See comments
+ // below on discard_basic_block_info().
+ //
+ // IRType gWordTy and IRType hWordTy contain the types of native
+ // words on the guest (simulated) and host (real) CPUs. They will
+ // by either Ity_I32 or Ity_I64. So far we have never built a
+ // cross-architecture Valgrind so they should always be the same.
+ //
+ IRBB*(*instrument)(VgCallbackClosure* closure,
+ IRBB* bb_in,
+ VexGuestLayout* layout,
+ VexGuestExtents* vge,
+ IRType gWordTy,
+ IRType hWordTy),
// Finish up, print out any results, etc. `exitcode' is program's exit
// code. The shadow can be found with VG_(get_exit_status_shadow)().
@@ -215,9 +277,10 @@
// - If info is being stored at a per-translation level, use orig_addr
// to identify which translation is being discarded. Each translation
// will be discarded exactly once.
- // This orig_addr will match the orig_addr which was passed to
- // to instrument() when this translation was made. Note that orig_addr
- // won't necessarily be the same as the first address in "extents".
+ // This orig_addr will match the closure->nraddr which was passed to
+ // to instrument() (see extensive comments above) when this
+ // translation was made. Note that orig_addr won't necessarily be
+ // the same as the first address in "extents".
// - If info is being stored at a per-instruction level, you can get
// the address range(s) being discarded by stepping through "extents".
// Note that any single instruction may belong to more than one