-----------------------------------------------------------------------------
overview
-----------------------------------------------------------------------------
This commit introduces an optimisation that speeds up Memcheck by roughly
-3% to 28% (ie. a slight slowdown in the worst case), and Addrcheck by 1% to
36%, at least for the SPEC2000 benchmarks on my 1400MHz Athlon.

Basic idea: the handling of A/V bit updates on %esp-adjustments was quite
sub-optimal -- for each "PUT ESP", a function was called that computed the
delta between the old and new ESPs, and then called a looping function to deal
with it.

Improvements:

  1. most of the time, the delta can be seen from the code.  So there's no need
     to compute it.

  2. when the delta is known, we can directly call a skin function to handle it.

  3. we can specialise for certain common cases (eg. +/- 4, 8, 12, 16, 32),
     including having unrolled loops for these.

This slightly bloats UCode because of setting up args for the call, and for
updating ESP in code (previously this was done in the called C function).  Eg.
for `date' the code expansion ratio goes from 14.2 --> 14.6.  But it's much
faster.

Note that skins don't have to use the specialised cases, they can just
define the ordinary case if they want;  the specialised cases are only used
if present.
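
As a rough illustration of the choice described above (not the code in this
commit), here is a minimal sketch of picking a specialised hook when the delta
is known and the skin defined one, and falling back to the generic hook
otherwise.  The Hooks struct, call_new_mem_stack() and the demo_* functions are
hypothetical names, only two sizes are shown, and the real choice is made when
the UCode is generated rather than at run time.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Addr;

/* Hypothetical hook table; a NULL entry means the skin did not define
   that specialised case. */
typedef struct {
    void (*new_mem_stack_4)(Addr new_esp);
    void (*new_mem_stack_8)(Addr new_esp);
    void (*new_mem_stack)(Addr start, uint32_t len);   /* generic case */
} Hooks;

/* Handle a stack growth of `len` bytes.  Returns 1 if a specialised hook
   was used, 0 if the generic hook was used instead. */
int call_new_mem_stack(const Hooks *h, Addr new_esp, uint32_t len)
{
    void (*fixed)(Addr) = NULL;

    switch (len) {
        case 4:  fixed = h->new_mem_stack_4; break;
        case 8:  fixed = h->new_mem_stack_8; break;
        default: break;
    }
    if (fixed != NULL) {
        fixed(new_esp);          /* size is implied by the hook itself */
        return 1;
    }
    assert(h->new_mem_stack != NULL);   /* generic hook must exist here */
    h->new_mem_stack(new_esp, len);
    return 0;
}

/* Demo hooks that just count how many bytes they were asked to mark. */
static uint32_t bytes_marked = 0;
void demo_new_4(Addr a)                 { (void)a; bytes_marked += 4; }
void demo_new_generic(Addr a, uint32_t len) { (void)a; bytes_marked += len; }
```

With only demo_new_4 and demo_new_generic registered, a 4-byte adjustment goes
to the specialised hook and an 8-byte one falls through to the generic hook.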

-----------------------------------------------------------------------------
details
-----------------------------------------------------------------------------
Removed addrcheck/ac_common.c, put its (minimal) contents in ac_main.c.

Updated the major interface version, because this change isn't binary
compatible with the old core/skin interface.

Removed the hooks {new,die}_mem_stack_aligned, replaced with the better
{new,die}_mem_stack_{4,8,12,16,32}.  Still have the generic {die,new}_mem_stack
hooks.  These are called directly from UCode, thanks to a new pass that occurs
between instrumentation and register allocation (but only if the skin uses
these stack-adjustment hooks).  VG_(unknown_esp_update)() is called from UCode
for the generic case;  it determines if it's a stack switch, and calls the
generic {new,die}_mem_stack hooks accordingly.  This meant
synth_handle_esp_assignment() could be removed.
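
To make the generic case concrete, here is a sketch of the kind of decision it
has to make, assuming a downward-growing x86 stack.  This is not the body of
VG_(unknown_esp_update)();  classify_esp_update(), the EspUpdate type and the
HUGE_DELTA threshold used to guess at a stack switch are all assumptions made
for illustration.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t Addr;

/* Assumed threshold: a jump bigger than this is treated as a switch to a
   different stack rather than a push or pop. */
#define HUGE_DELTA (2 * 1000 * 1000)

typedef enum { ESP_NONE, ESP_NEW, ESP_DIE, ESP_SWITCH } EspAction;

typedef struct {
    EspAction action;   /* which generic hook (if any) should run     */
    Addr      start;    /* first byte of the affected range           */
    uint32_t  len;      /* number of bytes that became live or dead   */
} EspUpdate;

EspUpdate classify_esp_update(Addr old_esp, Addr new_esp)
{
    EspUpdate u = { ESP_NONE, 0, 0 };
    int32_t delta = (int32_t)(new_esp - old_esp);

    if (delta < -HUGE_DELTA || delta > HUGE_DELTA) {
        u.action = ESP_SWITCH;            /* no A/V bit changes at all */
    } else if (delta < 0) {
        u.action = ESP_NEW;               /* stack grew downwards      */
        u.start  = new_esp;
        u.len    = (uint32_t)(-delta);
    } else if (delta > 0) {
        u.action = ESP_DIE;               /* stack shrank              */
        u.start  = old_esp;
        u.len    = (uint32_t)delta;
    }
    assert(u.len <= (uint32_t)HUGE_DELTA);
    return u;
}
```

An ESP_NEW result corresponds to calling the generic new_mem_stack hook on
[start, start+len), and ESP_DIE to calling die_mem_stack on the same kind of
range.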

The new %esp-delta computation phase is in vg_translate.c.

In Memcheck and Addrcheck, added functions for updating the A and V bits of a
single aligned word and a single aligned doubleword.  These are called from the
specialised functions new_mem_stack_4, etc.  This meant the equivalent function
for the old hooks new_mem_stack_aligned and die_mem_stack_aligned could be
removed.

In mc_common.h, added a big macro containing the definitions of new_mem_stack_4
et al.  It's ``instantiated'' separately by Memcheck and Addrcheck.  The macro
is a bit klugey, but I did it that way because speed is vital for these
functions, so eg. a function pointer would have slowed things down.
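
For readers unfamiliar with the trick, here is a much-reduced sketch of the
macro approach.  It is not the macro in mc_common.h:  DEFINE_STACK_FNS, the
mc_/ac_ prefixes and the *_make_word helpers are hypothetical, and only two
sizes are generated.  The point is that each tool's helper name is pasted in
textually, so the compiler sees a direct (inlinable) call rather than a call
through a function pointer.

```c
#include <assert.h>
#include <stdint.h>

typedef uint32_t Addr;

/* Hypothetical macro: stamps out the specialised handlers for one tool,
   using that tool's own "mark one aligned word" helper. */
#define DEFINE_STACK_FNS(prefix, MAKE_WORD_ACCESSIBLE)      \
    void prefix##_new_mem_stack_4(Addr new_esp)             \
    {                                                       \
        MAKE_WORD_ACCESSIBLE(new_esp);                      \
    }                                                       \
    void prefix##_new_mem_stack_8(Addr new_esp)             \
    {                                                       \
        MAKE_WORD_ACCESSIBLE(new_esp);                      \
        MAKE_WORD_ACCESSIBLE(new_esp + 4);                  \
    }

/* Stand-ins for the per-tool helpers; they only count the words touched. */
static unsigned mc_words = 0, ac_words = 0;

static inline void mc_make_word(Addr a)
{
    assert((a & 3) == 0);    /* helper handles a single aligned word */
    mc_words++;
}

static inline void ac_make_word(Addr a)
{
    assert((a & 3) == 0);
    ac_words++;
}

/* One instantiation per tool, as the text above describes. */
DEFINE_STACK_FNS(mc, mc_make_word)
DEFINE_STACK_FNS(ac, ac_make_word)
```

Each instantiation yields its own mc_new_mem_stack_4, ac_new_mem_stack_4 and so
on, at the cost of the macro body being harder to read and debug than an
ordinary function.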

Updated the built-in profiling events appropriately for the changes (removed
one old event, added a new one;  finding their names is left as an exercise for
the reader).

Fixed memory event profiling in {Addr,Mem}check, which had rotted.

A few other minor things.


git-svn-id: svn://svn.valgrind.org/valgrind/trunk@1510 a5019735-40e9-0310-863c-91ae7b9d1cf9
diff --git a/memcheck/mc_common.c b/memcheck/mc_common.c
index 821bdd6..deedf8e 100644
--- a/memcheck/mc_common.c
+++ b/memcheck/mc_common.c
@@ -501,13 +501,25 @@
 
    100  fpu_access_check_SLOWLY
    101  fpu_access_check_SLOWLY(byte loop)
+
+   110  new_mem_stack_4
+   111  new_mem_stack_8
+   112  new_mem_stack_12
+   113  new_mem_stack_16
+   114  new_mem_stack_32
+   115  new_mem_stack
+
+   120  die_mem_stack_4
+   121  die_mem_stack_8
+   122  die_mem_stack_12
+   123  die_mem_stack_16
+   124  die_mem_stack_32
+   125  die_mem_stack
 */
 
 #ifdef VG_PROFILE_MEMORY
 
-#define N_PROF_EVENTS 150
-
-extern UInt MC_(event_ctr)[N_PROF_EVENTS];
+UInt MC_(event_ctr)[N_PROF_EVENTS];
 
 void MC_(init_prof_mem) ( void )
 {
@@ -533,8 +545,6 @@
 void MC_(init_prof_mem) ( void ) { }
 void MC_(done_prof_mem) ( void ) { }
 
-#define PROF_EVENT(ev) /* */
-
 #endif
 
 /*------------------------------------------------------------*/
@@ -677,7 +687,6 @@
    }
 }
 
-
 /*--------------------------------------------------------------------*/
 /*--- end                                              mc_common.c ---*/
 /*--------------------------------------------------------------------*/