-----------------------------------------------------------------------------
overview
-----------------------------------------------------------------------------
Previously Valgrind had its own versions of malloc() et al that replaced
glibc's.  This is necessary for Memcheck for various reasons, but some other
skins don't need it, and for them it was actually detrimental.  I never managed
to treat this satisfactorily w.r.t. the core/skin split.

Now I have.  If a skin needs to know about malloc() et al, it must provide its
own replacements.  But because this is not uncommon, the core provides a module
vg_replace_malloc.c which a skin can link with, which provides skeleton
definitions, to reduce the amount of work a skin must do.  The skeletons handle
the transfer of control from the simulated ("simd") CPU to the real CPU, and
also the --alignment, --sloppy-malloc and --trace-malloc options.  These skeleton
definitions subsequently call functions SK_(malloc), SK_(free), etc, which the
skin must define;  in these functions the skin can do the things it needs to do
about tracking heap blocks.

For skins that track extra info about malloc'd blocks -- previously done with
ShadowChunks -- there is a new file vg_hashtable.c that implements a
generic-ish hash table (using dodgy C-style inheritance using struct overlays)
which allows skins to continue doing this fairly easily.
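
A minimal sketch of the struct-overlay idea, for illustration only.  All names
here (Table, Node, HeapBlock, table_add, ...) are hypothetical and are not the
vg_hashtable.c interface.  The sketch also embeds the common header as the
first member of the skin's struct, which is the well-defined C variant of
"structs with similar layouts", rather than duplicating the leading fields.

```c
#include <assert.h>
#include <stddef.h>

#define N_BUCKETS 64

/* Common header shared by every node type: chain link plus lookup key. */
typedef struct Node {
    struct Node  *next;
    unsigned long key;
} Node;

typedef struct {
    Node *buckets[N_BUCKETS];
} Table;

/* A skin-specific node: the header comes first, extra payload follows.
   A HeapBlock* and a pointer to its hdr member refer to the same address. */
typedef struct {
    Node   hdr;    /* hdr.key holds the block's start address */
    size_t size;   /* skin-specific data                      */
} HeapBlock;

/* Insert a node at the head of its bucket's chain. */
static void table_add(Table *t, Node *n)
{
    assert(t != NULL && n != NULL);
    size_t b = n->key % N_BUCKETS;
    n->next = t->buckets[b];
    t->buckets[b] = n;
}

/* Return the node with the given key, or NULL if there is none. */
static Node *table_find(const Table *t, unsigned long key)
{
    for (Node *n = t->buckets[key % N_BUCKETS]; n != NULL; n = n->next)
        if (n->key == key)
            return n;
    return NULL;
}

/* Unlink and return the node with the given key, or NULL if absent. */
static Node *table_remove(Table *t, unsigned long key)
{
    Node **pp = &t->buckets[key % N_BUCKETS];
    for (; *pp != NULL; pp = &(*pp)->next) {
        if ((*pp)->key == key) {
            Node *n = *pp;
            *pp = n->next;
            n->next = NULL;
            return n;
        }
    }
    return NULL;
}
```

The table code only ever sees Node; a skin casts the result of table_find()
back to its own type to reach the extra fields.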

Skins can replace other functions too, e.g. Memcheck has its own versions
of strcpy(), memcpy(), etc.

Overall, it's slightly more work now for skins that need to replace malloc(),
but other skins don't have to use Valgrind's malloc(), so they're getting a
"purer" program run, which is good, and most of the remaining rough edges from
the core/skin split have been removed.

-----------------------------------------------------------------------------
details
-----------------------------------------------------------------------------
Moved malloc() et al intercepts from vg_clientfuncs.c into vg_replace_malloc.c.
Skins can link to it if they want to replace malloc() and friends;  it does
some stuff then passes control to SK_(malloc)() et al which the skin must
define.  They can call VG_(cli_malloc)() and VG_(cli_free)() to do the actual
allocation/deallocation.  Redzone size for the client (the CLIENT arena) is
specified by the static variable VG_(vg_malloc_redzone_szB).
vg_replace_malloc.c thus represents a kind of "mantle" level service.

To get automake to build vg_replace_malloc.o, had to resort to a similar trick
as used for the demangler -- ask for a "no install" library (which is never
used) to be built from it.

Note that malloc, calloc, realloc, builtin_new, builtin_vec_new and memalign
are all now aware of --alignment, whether running on the simd CPU or the real
CPU.

This means the new_mem_heap, die_mem_heap, copy_mem_heap and ban_mem_heap
events no longer exist, since the core doesn't control malloc() any more, and
skins can watch for these events themselves.

This required moving all the ShadowChunk stuff out of the core, which meant
the sizeof_shadow_block ``need'' could be removed, yay -- it was a horrible
hack.  Now ShadowChunks are done with a generic HashTable type, in
vg_hashtable.c, which skins can "inherit from" (in a dodgy C-only fashion by
using structs with similar layouts).  Also, the free_list stuff was all moved
as a part of this.  Also, VgAllocKind was moved out of core into
Memcheck/Addrcheck and renamed MAC_AllocKind.

Moved these options out of core into vg_replace_malloc.c:
    --trace-malloc
    --sloppy-malloc
    --alignment
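
A rough sketch of what two of these options imply, using hypothetical helper
names that are not taken from vg_replace_malloc.c.  The alignment range is the
one the core's option check enforced (a power of 2, >= 4, <= 4096); the 4-byte
word size for --sloppy-malloc is an assumption based on x86-linux.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* --alignment: accept only powers of two in the range [4, 4096]. */
static bool alignment_is_valid(long a)
{
    return a >= 4 && a <= 4096 && (a & (a - 1)) == 0;
}

/* --sloppy-malloc: round a request size up to the next multiple of the
   word size (assumed here to be 4 bytes). */
static size_t sloppy_round(size_t n)
{
    assert(n <= SIZE_MAX - 3);   /* guard against wrap-around */
    return (n + 3) & ~(size_t)3;
}
```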

The alternative_free ``need'' could go, too, since Memcheck is now in complete
control of free(), yay -- another horribility.

The bad_free and free_mismatch events could go too, since they're now not
detected by core, yay -- yet another horribility.

Moved malloc() et al wrappers for Memcheck out of vg_clientmalloc.c into
mac_malloc_wrappers.c.  Helgrind has its own wrappers now too.

Introduced VG_USERREQ__CLIENT_CALL[123] client requests.  When a skin function
is operating on the simd CPU, this will call a given function and run it on the
real CPU.  The macros VG_NON_SIMD_CALL[123] in valgrind.h present a cleaner
interface to actually use.  Also introduce analogues of these that pass 'tst'
from the scheduler as the first arg to the called function -- needed for
MC_(client_malloc)() et al.

Fiddled with USERREQ_{MALLOC,FREE} etc. in vg_scheduler.c; they call
SK_({malloc,free})() which by default call VG_(cli_malloc)() -- can't call
glibc's malloc() here.  All the other default SK_(calloc)() etc. instantly
panic; there's a lock variable to ensure that the default SK_({malloc,free})()
are only called from the scheduler, which prevents a skin from forgetting to
override SK_({malloc,free})().  Got rid of the unused USERREQ_CALLOC,
USERREQ_BUILTIN_NEW, etc.

Moved the special versions of strcpy(), strlen(), etc., memcpy() and memchr()
into mac_replace_strmem.c -- they are only necessary for Memcheck, because the
hyper-optimised normal glibc versions confuse it, and because memcpy() etc. need
overlap checking.

Also added dst/src overlap checks to strcpy(), memcpy(), strcat().  They are
reported not as proper errors, but just with single line warnings, as for silly
args to malloc() et al;  this is mainly because they're on the simulated CPU
and proper error handling would be a pain;  hopefully they're rare enough to
not be a problem.  The strcpy check is done after the copy, because doing it
beforehand would require counting the length of the string first.  Also added
strncpy() and strncat(), which have overlap checks too.  Note that Addrcheck
doesn't do overlap checking.
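
The overlap test and the check-after-copy ordering for strcpy() can be
sketched as follows.  This is an illustration under assumptions, not the code
in mac_replace_strmem.c: the function names are hypothetical, and the sketch
returns a flag where the real replacements print a one-line warning.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if [dst, dst+dstlen) and [src, src+srclen) share at least one byte. */
static bool ranges_overlap(const void *dst, size_t dstlen,
                           const void *src, size_t srclen)
{
    uintptr_t d = (uintptr_t)dst;
    uintptr_t s = (uintptr_t)src;

    if (dstlen == 0 || srclen == 0)
        return false;
    if (d < s)
        return d + dstlen > s;   /* dst starts first: does it reach src? */
    return s + srclen > d;       /* src starts first, or same start      */
}

/* Copy src to dst byte by byte, then report whether the two ranges
   overlapped.  The length is only known once the copy has found the
   terminating NUL, so the check comes after the copy. */
static bool strcpy_reports_overlap(char *dst, const char *src)
{
    assert(dst != NULL && src != NULL);
    const char *src_start = src;
    char       *dst_start = dst;

    while ((*dst++ = *src++) != '\0')
        ;

    size_t n = (size_t)(src - src_start);   /* bytes copied, incl. NUL */
    return ranges_overlap(dst_start, n, src_start, n);
}
```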

Put USERREQ__LOGMESSAGE in vg_skin.h to do the overlap check error messages.

After moving malloc() et al and strcpy() et al out of vg_clientfuncs.c, moved
the remaining three things (sigsuspend, VG_(__libc_freeres_wrapper),
__errno_location) into vg_intercept.c, since it contains things that run on the
simulated CPU too.  Removed vg_clientfuncs.c altogether.

Moved regression test "malloc3" out of corecheck into memcheck, since corecheck
no longer looks for silly (eg. negative) args to malloc().

Removed the m_eip, m_esp, m_ebp fields from the `Error' type.  They were being
set up, and then read immediately only once, only if GDB attachment was done.
So now they're just being held in local variables.  This saves 12 bytes per
Error.

Made replacement calloc() check for --sloppy-malloc;  previously it didn't.

Added "silly" negative size arg check to realloc(); it didn't have one.

Changed VG_(read_procselfmaps)() so it can parse the file directly, or from a
previously read buffer.  The buffer can be filled with the new
VG_(read_procselfmaps_contents)().  This is used at start-up to snapshot
/proc/self/maps before the skins do anything, and then to parse it once they
have done their setup stuff.  Skins can now safely call VG_(malloc)() in
SK_({pre,post}_clo_init)() without the mmap'd superblock erroneously being
identified as client memory.
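
The snapshot-then-parse split can be sketched like this.  The names
(snapshot_maps, parse_maps, count_rwx) and the fixed buffer size are
hypothetical; this is not the VG_(read_procselfmaps)() implementation, and it
only extracts the address range and permissions from each line.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

static char maps_buf[65536];   /* snapshot storage; size is an assumption */

/* Step 1: read the whole file into maps_buf.  Returns bytes read, or -1. */
static long snapshot_maps(const char *path)
{
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;
    size_t n = fread(maps_buf, 1, sizeof maps_buf - 1, f);
    fclose(f);
    maps_buf[n] = '\0';
    return (long)n;
}

typedef void (*SegCallback)(unsigned long start, unsigned long size,
                            const char *perms);

/* Step 2: parse a previously captured buffer without touching the file.
   Calls cb once per well-formed line; returns the number of segments. */
static int parse_maps(const char *buf, SegCallback cb)
{
    int count = 0;
    assert(buf != NULL);
    while (*buf != '\0') {
        unsigned long start, end;
        char perms[5];
        if (sscanf(buf, "%lx-%lx %4s", &start, &end, perms) == 3) {
            if (cb != NULL)
                cb(start, end - start, perms);
            count++;
        }
        const char *nl = strchr(buf, '\n');
        if (nl == NULL)
            break;
        buf = nl + 1;
    }
    return count;
}

/* Example callback: total up the bytes in segments mapped rwx. */
static unsigned long rwx_bytes;

static void count_rwx(unsigned long start, unsigned long size,
                      const char *perms)
{
    (void)start;
    if (strncmp(perms, "rwx", 3) == 0)
        rwx_bytes += size;
}
```

Anything mapped between the two steps, such as a superblock mmap'd by a skin's
initialisation code, is absent from the snapshot and so is never reported to
the callback.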

Changed the --help usage message slightly, now divided into four sections: core
normal, skin normal, core debugging, skin debugging.  Changed the interface for
the command_line_options need slightly -- now two functions, VG_(print_usage)()
and VG_(print_debug_usage)(), and they do the printing themselves, instead of
just returning a string -- that's more flexible.

Removed DEBUG_CLIENTMALLOC code, it wasn't being used and was a pain.

Added a regression test testing leak suppressions (nanoleak_supp), and another
testing strcpy/memcpy/etc overlap warnings (overlap).

Also changed Addrcheck to link with the files shared with Memcheck, rather than
#including the .c files directly.

Commoned up a little more shared Addrcheck/Memcheck code, for the usage
message, and initialisation/finalisation.

Added a Bool param to VG_(unique_error)() dictating whether it should allow
GDB to be attached; this is for leak checks, because we don't want to attach
GDB on leak errors (it causes seg faults).  A bit hacky, but it will do.

Had to change lots of the expected outputs from regression files now that
malloc() et al are in vg_replace_malloc.c rather than vg_clientfuncs.c.


git-svn-id: svn://svn.valgrind.org/valgrind/trunk@1524 a5019735-40e9-0310-863c-91ae7b9d1cf9
diff --git a/coregrind/vg_main.c b/coregrind/vg_main.c
index bb75f0b..69079b7 100644
--- a/coregrind/vg_main.c
+++ b/coregrind/vg_main.c
@@ -491,8 +491,6 @@
 Int    VG_(sanity_level)       = 1;
 Int    VG_(clo_verbosity)      = 1;
 Bool   VG_(clo_demangle)       = True;
-Bool   VG_(clo_sloppy_malloc)  = False;
-Int    VG_(clo_alignment)      = 4;
 Bool   VG_(clo_trace_children) = False;
 
 /* See big comment in vg_include.h for meaning of these three. */
@@ -509,7 +507,6 @@
 Bool   VG_(clo_trace_syscalls) = False;
 Bool   VG_(clo_trace_signals)  = False;
 Bool   VG_(clo_trace_symtab)   = False;
-Bool   VG_(clo_trace_malloc)   = False;
 Bool   VG_(clo_trace_sched)    = False;
 Int    VG_(clo_trace_pthread_level) = 0;
 ULong  VG_(clo_stop_after)     = 1000000000000000LL;
@@ -593,8 +590,6 @@
 "    --demangle=no|yes         automatically demangle C++ names? [yes]\n"
 "    --num-callers=<number>    show <num> callers in stack traces [4]\n"
 "    --error-limit=no|yes      stop showing new errors if too many? [yes]\n"
-"    --sloppy-malloc=no|yes    round malloc sizes to next word? [no]\n"
-"    --alignment=<number>      set minimum alignment of allocations [4]\n"
 "    --trace-children=no|yes   Valgrind-ise child processes? [no]\n"
 "    --run-libc-freeres=no|yes Free up glibc memory at exit? [yes]\n"
 "    --logfile-fd=<number>     file descriptor for messages [2=stderr]\n"
@@ -620,15 +615,18 @@
 "    --trace-syscalls=no|yes   show all system calls? [no]\n"
 "    --trace-signals=no|yes    show signal handling details? [no]\n"
 "    --trace-symtab=no|yes     show symbol table details? [no]\n"
-"    --trace-malloc=no|yes     show client malloc details? [no]\n"
 "    --trace-sched=no|yes      show thread scheduler details? [no]\n"
-"    --trace-pthread=none|some|all  show pthread event details? [no]\n"
+"    --trace-pthread=none|some|all  show pthread event details? [none]\n"
 "    --stop-after=<number>     switch to real CPU after executing\n"
 "                              <number> basic blocks [infinity]\n"
 "    --dump-error=<number>     show translation for basic block\n"
 "                              associated with <number>'th\n"
 "                              error context [0=don't show any]\n"
 "\n"
+"  %s skin debugging options:\n";
+
+   Char* usage3 =
+"\n"
 "  Extra options are read from env variable $VALGRIND_OPTS\n"
 "\n"
 "  Valgrind is Copyright (C) 2000-2002 Julian Seward\n"
@@ -642,10 +640,15 @@
    VG_(printf)(usage1, VG_(details).name);
    /* Don't print skin string directly for security, ha! */
    if (VG_(needs).command_line_options)
-      VG_(printf)("%s", SK_(usage)());
+      SK_(print_usage)();
    else
       VG_(printf)("    (none)\n");
-   VG_(printf)(usage2, VG_EMAIL_ADDR);
+   VG_(printf)(usage2, VG_(details).name);
+   if (VG_(needs).command_line_options)
+      SK_(print_debug_usage)();
+   else
+      VG_(printf)("    (none)\n");
+   VG_(printf)(usage3, VG_EMAIL_ADDR);
 
    VG_(shutdown_logging)();
    VG_(clo_log_to)     = VgLogTo_Fd;
@@ -707,11 +710,13 @@
    {
        UInt* sp;
 
-       /* Look for the stack segment by reading /proc/self/maps and
+       /* Look for the stack segment by parsing /proc/self/maps and
 	  looking for a section bracketing VG_(esp_at_startup) which
-	  has rwx permissions and no associated file. */
+	  has rwx permissions and no associated file.  Note that this uses
+          the /proc/self/maps contents read at the start of VG_(main)(),
+          and doesn't re-read /proc/self/maps. */
 
-       VG_(read_procselfmaps)( vg_findstack_callback );
+       VG_(read_procselfmaps)( vg_findstack_callback, /*read_from_file*/False );
 
        /* Now vg_foundstack_start and vg_foundstack_size
           should delimit the stack. */
@@ -890,14 +895,6 @@
       else if (VG_CLO_STREQ(argv[i], "--demangle=no"))
          VG_(clo_demangle) = False;
 
-      else if (VG_CLO_STREQ(argv[i], "--sloppy-malloc=yes"))
-         VG_(clo_sloppy_malloc) = True;
-      else if (VG_CLO_STREQ(argv[i], "--sloppy-malloc=no"))
-         VG_(clo_sloppy_malloc) = False;
-
-      else if (VG_CLO_STREQN(12, argv[i], "--alignment="))
-         VG_(clo_alignment) = (Int)VG_(atoll)(&argv[i][12]);
-
       else if (VG_CLO_STREQ(argv[i], "--trace-children=yes"))
          VG_(clo_trace_children) = True;
       else if (VG_CLO_STREQ(argv[i], "--trace-children=no"))
@@ -993,11 +990,6 @@
       else if (VG_CLO_STREQ(argv[i], "--trace-symtab=no"))
          VG_(clo_trace_symtab) = False;
 
-      else if (VG_CLO_STREQ(argv[i], "--trace-malloc=yes"))
-         VG_(clo_trace_malloc) = True;
-      else if (VG_CLO_STREQ(argv[i], "--trace-malloc=no"))
-         VG_(clo_trace_malloc) = False;
-
       else if (VG_CLO_STREQ(argv[i], "--trace-sched=yes"))
          VG_(clo_trace_sched) = True;
       else if (VG_CLO_STREQ(argv[i], "--trace-sched=no"))
@@ -1042,16 +1034,6 @@
    if (VG_(clo_verbosity < 0))
       VG_(clo_verbosity) = 0;
 
-   if (VG_(clo_alignment) < 4 
-       || VG_(clo_alignment) > 4096
-       || VG_(log2)( VG_(clo_alignment) ) == -1 /* not a power of 2 */) {
-      VG_(message)(Vg_UserMsg, "");
-      VG_(message)(Vg_UserMsg, 
-         "Invalid --alignment= setting.  "
-         "Should be a power of 2, >= 4, <= 4096.");
-      VG_(bad_option)("--alignment");
-   }
-
    if (VG_(clo_GDB_attach) && VG_(clo_trace_children)) {
       VG_(message)(Vg_UserMsg, "");
       VG_(message)(Vg_UserMsg, 
@@ -1149,7 +1131,7 @@
 
       /* Core details */
       VG_(message)(Vg_UserMsg,
-         "Using valgrind-%s, a program instrumentation system for x86-linux.",
+         "Using valgrind-%s, a program supervision framework for x86-linux.",
          VERSION);
       VG_(message)(Vg_UserMsg, 
          "Copyright (C) 2000-2002, and GNU GPL'd, by Julian Seward.");
@@ -1394,6 +1376,17 @@
       VG_(stack)[10000-1-i] = (UInt)(&VG_(stack)[10000-i-1]) ^ 0xABCD4321;
    }
 
+   /* Read /proc/self/maps into a buffer.  Must be before:
+      - SK_(pre_clo_init)(): so that if it calls VG_(malloc)(), any mmap'd
+        superblocks are not erroneously identified as being owned by the
+        client, which would be bad.
+      - init_memory(): that's where the buffer is parsed
+      - init_tt_tc(): so the anonymous mmaps for the translation table and
+        translation cache aren't identified as part of the client, which would
+        waste > 20M of virtual address space, and be bad.
+   */
+   VG_(read_procselfmaps_contents)();
+
    /* Hook to delay things long enough so we can get the pid and
       attach GDB in another shell. */
    if (0) {
@@ -1408,23 +1401,25 @@
       - process_cmd_line_options(): to register skin name and description,
         and turn on/off 'command_line_options' need
       - init_memory() (to setup memory event trackers).
-    */
+   */
    SK_(pre_clo_init)();
    VG_(sanity_check_needs)();
 
    /* Process Valgrind's command-line opts (from env var VG_ARGS). */
    process_cmd_line_options();
 
-   /* Do post command-line processing initialisation */
+   /* Do post command-line processing initialisation.  Must be before:
+      - vg_init_baseBlock(): to register any more helpers
+   */
    SK_(post_clo_init)();
 
-   /* Set up baseBlock offsets and copy the saved machine's state into it. 
-      Comes after SK_(post_clo_init) in case it registers helpers. */
+   /* Set up baseBlock offsets and copy the saved machine's state into it. */
    vg_init_baseBlock();
 
    /* Initialise the scheduler, and copy the client's state from
-      baseBlock into VG_(threads)[1].  This has to come before signal
-      initialisations. */
+      baseBlock into VG_(threads)[1].  Must be before:
+      - VG_(sigstartup_actions)()
+   */
    VG_(scheduler_init)();
 
    /* Initialise the signal handling subsystem, temporarily parking
@@ -1438,8 +1433,7 @@
    /* Start calibration of our RDTSC-based clock. */
    VG_(start_rdtsc_calibration)();
 
-   /* Must come after SK_(init) so memory handler accompaniments (eg.
-    * shadow memory) can be setup ok */
+   /* Parse /proc/self/maps to learn about startup segments. */
    VGP_PUSHCC(VgpInitMem);
    VG_(init_memory)();
    VGP_POPCC(VgpInitMem);
@@ -1453,10 +1447,7 @@
       we can. */
    VG_(end_rdtsc_calibration)();
 
-   /* This should come after init_memory_and_symbols(); otherwise the 
-      latter carefully sets up the permissions maps to cover the 
-      anonymous mmaps for the translation table and translation cache, 
-      which wastes > 20M of virtual address space. */
+   /* Initialise translation table and translation cache. */
    VG_(init_tt_tc)();
 
    if (VG_(clo_verbosity) == 1) {