Redo the way cachegrind generates instrumentation code, so that it can
deal with any IR that happens to show up.  This makes it work on ppc32
and should fix occasionally-reported bugs on x86/amd64 where it bombs
due to having to deal with multiple date references in a single
instruction.

The new scheme is based around the idea of a queue of memory events
which are outstanding, in the sense that no IR has yet been generated
to do the relevant helper calls.  The presence of the queue --
currently 16 entries deep -- gives cachegrind more scope for combining
multiple memory references into a single helper function call.  As a
result it runs 3%-5% faster than the previous version, on x86.

This commit also changes the type of the tool interface function
'tool_discard_basic_block_info' and clarifies its meaning.  See
comments in include/pub_tool_tooliface.h.



git-svn-id: svn://svn.valgrind.org/valgrind/trunk@4903 a5019735-40e9-0310-863c-91ae7b9d1cf9
5 files changed