Push towards a final version for 3.2.0.


git-svn-id: svn://svn.valgrind.org/valgrind/trunk@5932 a5019735-40e9-0310-863c-91ae7b9d1cf9
diff --git a/NEWS b/NEWS
index 6508840..33a64d5 100644
--- a/NEWS
+++ b/NEWS
@@ -1,90 +1,99 @@
-Release 3.2.0 (19 May 2006)
+
+Release 3.2.0 (?? May 2006)
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~
-3.2.0 is a feature release with a number of significant improvements:
-Performance (especially of Memcheck) is much improved, Addrcheck has been
-removed, Callgrind has been added, PPC64/Linux support has been added,
-Lackey has been improved, and MPI support has been added.  In detail:
+3.2.0 is a feature release with many significant improvements and the
+usual collection of bug fixes.  This release supports X86/Linux,
+AMD64/Linux, PPC32/Linux and PPC64/Linux.
 
-- Performance is much improved:  programs typically run 1.20--1.40 times
-  faster under Memcheck (much more for some unusual programs) with an
-  average of about 1.30 for the programs we tested it on.  The improvements
-  for Nulgrind are similar.  We haven't measured Cachegrind and Massif, they
-  should be also be faster, but with smaller improvements.  We are
-  interested to hear what improvements users get.
+Performance, especially of Memcheck, is much improved, Addrcheck has
+been removed, Callgrind has been added, PPC64/Linux support has been
+added, Lackey has been improved, and MPI support has been added.  In
+detail:
 
-  Also, Memcheck uses much less memory, due to the introduction of a
-  "compressed V bits" representation for Memcheck's shadow memory.  The
-  amount of shadow memory used -- which accounts for a large percentage of
-  Memcheck's memory overhead -- has been reduced by a factor of more than 4
-  on most programs.  This means you should be able to run programs that use
-  more memory than before without hitting problems.  This change in
-  representation also contributes to the speed improvements.
+- Memcheck has improved speed and reduced memory use.  Programs
+  typically run 20-40% faster, averaging about 30% for SPEC CPU2000.
+  There are smaller but noticeable speed improvements for the other
+  tools.  We are interested to hear what improvements users get.
 
-- Addrcheck has been removed.  It has not worked since version 2.4.0, and
-  with the speed and memory improvements to Memcheck it is no longer worth
-  having around.  If you liked using Addrcheck because it didn't give
-  undefined value errors, you can use the new Memcheck option
-  --undef-value-errors=no to obtain this behaviour.
+  Memcheck uses less memory, due to the introduction of a compressed
+  representation for Memcheck's shadow memory.  The space overhead has
+  been reduced by a factor of more than four on most programs.  This
+  means you should be able to run programs that use more memory than
+  before without hitting problems.
 
-- Josef Weidendorfer's popular Callgrind tool has been added.  [XXX:
-  more details] [XXX: say something about KCachegrind and why it has not
-  been folded in...  I guess because its development is quite independent]
+- Addrcheck has been removed.  It has not worked since version 2.4.0,
+  and the speed and memory improvements to Memcheck make it redundant.
+  If you liked using Addrcheck because it didn't give undefined value
+  errors, you can use the new Memcheck option --undef-value-errors=no
+  to get the same behaviour.
+
+- The rate of incorrectly reported undefined-value errors in Memcheck,
+  already very low, has been reduced further.  In particular, efforts
+  have been made to ensure Memcheck works really well with gcc
+  4.0/4.1-generated code on X86/Linux and AMD64/Linux.
+
+- Josef Weidendorfer's popular Callgrind tool has been added.  Folding
+  it in is a logical step given its popularity and usefulness, and
+  makes it easier for us to ensure it works "out of the box" on all
+  supported targets.  The associated KDE KCachegrind GUI remains a
+  separate project.
 
 - Valgrind now works on PPC64/Linux.  As with the AMD64/Linux port,
-  this supports programs using to 32G of address space.  On
-  64-bit capable PPC64/Linux setups, you get a dual architecture
-  build so that both 32-bit and 64-bit executables can be run.
-  Linux on POWER5 is supported, and POWER4 is also believed to
-  work.  Both 32-bit and 64-bit DWARF2 is supported.  This port is
-  known to work well with both gcc-compiled and xlc/xlf-compiled code.
+  this supports programs using up to 32G of address space.  On 64-bit
+  capable PPC64/Linux setups, you get a dual architecture build so
+  that both 32-bit and 64-bit executables can be run.  Linux on POWER5
+  is supported, and POWER4 is also believed to work.  Both 32-bit and
+  64-bit DWARF2 are supported.  This port is known to work well with
+  both gcc-compiled and xlc/xlf-compiled code.
 
-- Floating point accuracy has been improved for PPC32/Linux.  
-  Specifically, the floating point rounding mode is observed on all
-  FP arithmetic operations, and multiply-accumulate instructions are
-  preserved by the compilation pipeline.  This means you should
-  get FP results which are bit-for-bit identical to a native run.
-  These improvements are also present in the PPC64/Linux port.
+- Floating point accuracy has been improved for PPC32/Linux.
+  Specifically, the floating point rounding mode is observed on all FP
+  arithmetic operations, and multiply-accumulate instructions are
+  preserved by the compilation pipeline.  This means you should get FP
+  results which are bit-for-bit identical to a native run.  These
+  improvements are also present in the PPC64/Linux port.
 
 - Lackey, the example tool, has been improved:
 
-  * It has a new option --detailed-counts (off by default) which causes
-    it to print out a count of loads, stores and ALU operations done, and
-    their sizes.
+  * It has a new option --detailed-counts (off by default) which
+    causes it to print out a count of loads, stores and ALU operations
+    done, and their sizes.
 
-  * It has a new option --trace-mem (off by default) which causes it to
-    print out a trace of all memory accesses performed by a program.  It's a
-    good starting point for building Valgrind tools that need to track
-    memory accesses.  Read the comments at the top of the file
-    lackey/lk_main.c for details.
+  * It has a new option --trace-mem (off by default) which causes it
+    to print out a trace of all memory accesses performed by a
+    program.  It's a good starting point for building Valgrind tools
+    that need to track memory accesses.  Read the comments at the top
+    of the file lackey/lk_main.c for details.
 
-  * The original instrumentation (counting numbers of instructions, jumps,
-    etc) is now controlled by a new option --basic-counts.  It is on by
-    default.
+  * The original instrumentation (counting numbers of instructions,
+    jumps, etc) is now controlled by a new option --basic-counts.  It
+    is on by default.
 
 - MPI support: partial support for debugging distributed applications
-  using the MPI library specification has been added.  Valgrind is 
+  using the MPI library specification has been added.  Valgrind is
   aware of the memory state changes caused by a subset of the MPI
   functions, and will carefully check data passed to the (P)MPI_
   interface.
 
-- A new flag, --error-exitcode=, has been added.  This allows changing the
-  exit code in runs where Valgrind reported errors, which is useful when
-  using Valgrind as part of an automated test suite.
+- A new flag, --error-exitcode=, has been added.  This allows changing
+  the exit code in runs where Valgrind reported errors, which is
+  useful when using Valgrind as part of an automated test suite.
 
-- XXX: others...
+- Various segfaults when reading old-style "stabs" debug information
+  have been fixed.
 
-Please note that Helgrind is still not working.  We have made an important
-step towards making it work again, however, with the addition of function
-wrapping (see below).
+Please note that Helgrind is still not working.  We have made an
+important step towards making it work again, however, with the
+addition of function wrapping (see below).
 
 Other user-visible changes:
 
-- Valgrind now has the ability to intercept and wrap arbitrary functions.
-  This is a preliminary step towards making Helgrind work again, and
-  was required for MPI support.
+- Valgrind now has the ability to intercept and wrap arbitrary
+  functions.  This is a preliminary step towards making Helgrind work
+  again, and was required for MPI support.
 
-- There are some changes to Memcheck's client requests.  Some of them have
-  changed names:
+- There are some changes to Memcheck's client requests.  Some of them
+  have changed names:
 
     MAKE_NOACCESS  --> MAKE_MEM_NOACCESS
     MAKE_WRITABLE  --> MAKE_MEM_UNDEFINED
@@ -94,9 +103,9 @@
     CHECK_READABLE --> CHECK_MEM_IS_DEFINED
     CHECK_DEFINED  --> CHECK_VALUE_IS_DEFINED
 
-  The reason for the change is that the old names are subtly misleading.
-  The old names will still work, but they are deprecated and may be removed
-  in a future release.
+  The reason for the change is that the old names are subtly
+  misleading.  The old names will still work, but they are deprecated
+  and may be removed in a future release.
 
   We also added a new client request:
   
@@ -108,7 +117,46 @@
 
 BUGS FIXED:
 
-XXX
+108258   NPTL pthread cleanup handlers not called
+117290   valgrind is sigKILL'd on startup
+117295   == 117290
+118703   m_signals.c:1427 Assertion 'tst->status == VgTs_WaitSys'
+118466   add %reg, %reg generates incorrect validity for bit 0
+123210   New: strlen from ld-linux on amd64
+123244   DWARF2 CFI reader: unhandled CFI instruction 0:18
+123248   syscalls in glibc-2.4: openat, fstatat, symlinkat
+123258   socketcall.recvmsg(msg.msg_iov[i] points to uninit
+123535   mremap(new_addr) requires MREMAP_FIXED in 4th arg
+123836   small typo in the doc
+124029   ppc compile failed: `vor' gcc 3.3.5
+124222   Segfault: @@don't know what type ':' is
+124475   ppc32: crash (syscall?) timer_settime()
+124499   amd64->IR: 0xF 0xE 0x48 0x85 (femms)
+124528   FATAL: aspacem assertion failed: segment_is_sane
+124697   vex x86->IR: 0xF 0x70 0xC9 0x0 (pshufw)
+124892   vex x86->IR: 0xF3 0xAE (REPx SCASB)
+126216   == 124892
+124808   ppc32: sys_sched_getaffinity() not handled
+n-i-bz   Very long stabs strings crash m_debuginfo
+n-i-bz   amd64->IR: 0x66 0xF 0xF5 (pmaddwd)
+125492   ppc32: support a bunch more syscalls
+121617   ppc32/64: coredumping gives assertion failure
+121814   Coregrind return error as exitcode patch
+126517   == 121814
+108528   NPTL pthread cleanup handlers not called
+125607   amd64->IR: 0x66 0xF 0xA3 0x2 (btw etc)
+125651   amd64->IR: 0xF8 0x49 0xFF 0xE3 (clc?)
+126253   x86 movx is wrong
+126451   3.2 SVN doesn't work on ppc32 CPU's without FPU
+126217   increase # threads
+126243   vex x86->IR: popw mem
+126583   amd64->IR: 0x48 0xF 0xA4 0xC2 (shld $1,%rax,%rdx)
+126668   amd64->IR: 0x1C 0xFF (sbb $0xff,%al)
+126696   support for CDROMREADRAW ioctl and CDROMREADTOCENTRY fix
+126722   assertion: segment_is_sane at m_aspacemgr/aspacemgr.c:1624
+126938   bad checking for syscalls linkat, renameat, symlinkat
+
+(3.2.0: ?? May 2006, vex r??, valgrind r??)
 
 
 Release 3.1.1 (15 March 2006)