| <html> |
| <head> |
| <style type="text/css"> |
| body { background-color: #ffffff; |
| color: #000000; |
| font-family: Times, Helvetica, Arial; |
| font-size: 14pt} |
| h4 { margin-bottom: 0.3em} |
| code { color: #000000; |
| font-family: Courier; |
| font-size: 13pt } |
| pre { color: #000000; |
| font-family: Courier; |
| font-size: 13pt } |
| a:link { color: #0000C0; |
| text-decoration: none; } |
| a:visited { color: #0000C0; |
| text-decoration: none; } |
| a:active { color: #0000C0; |
| text-decoration: none; } |
| </style> |
| </head> |
| |
| <body bgcolor="#ffffff"> |
| |
| <a name="title"> </a> |
| <h1 align=center>Valgrind, snapshot 20020324</h1> |
| |
| <center> |
| <a href="mailto:jseward@acm.org">jseward@acm.org</a><br> |
| <a href="http://www.muraroa.demon.co.uk">http://www.muraroa.demon.co.uk</a><br> |
| Copyright © 2000-2002 Julian Seward |
| <p> |
| Valgrind is licensed under the GNU General Public License, |
| version 2<br> |
| An open-source tool for finding memory-management problems in |
| Linux-x86 executables. |
| </center> |
| |
| <p> |
| |
| <hr width="100%"> |
| <a name="contents"></a> |
| <h2>Contents of this manual</h2> |
| |
| <h4>1 <a href="#intro">Introduction</a></h4> |
| 1.1 <a href="#whatfor">What Valgrind is for</a><br> |
| 1.2 <a href="#whatdoes">What it does with your program</a> |
| |
| <h4>2 <a href="#howtouse">How to use it, and how to make sense |
| of the results</a></h4> |
| 2.1 <a href="#starta">Getting started</a><br> |
| 2.2 <a href="#comment">The commentary</a><br> |
| 2.3 <a href="#report">Reporting of errors</a><br> |
| 2.4 <a href="#suppress">Suppressing errors</a><br> |
| 2.5 <a href="#flags">Command-line flags</a><br> |
| 2.6 <a href="#errormsgs">Explanation of error messages</a><br> |
| 2.7 <a href="#suppfiles">Writing suppressions files</a><br> |
| 2.8 <a href="#install">Building and installing</a><br> |
| 2.9 <a href="#problems">If you have problems</a><br> |
| |
| <h4>3 <a href="#machine">Details of the checking machinery</a></h4> |
| 3.1 <a href="#vvalue">Valid-value (V) bits</a><br> |
| 3.2 <a href="#vaddress">Valid-address (A) bits</a><br> |
| 3.3 <a href="#together">Putting it all together</a><br> |
| 3.4 <a href="#signals">Signals</a><br> |
| 3.5 <a href="#leaks">Memory leak detection</a><br> |
| |
| <h4>4 <a href="#limits">Limitations</a></h4> |
| |
| <h4>5 <a href="#howitworks">How it works -- a rough overview</a></h4> |
| 5.1 <a href="#startb">Getting started</a><br> |
| 5.2 <a href="#engine">The translation/instrumentation engine</a><br> |
| 5.3 <a href="#track">Tracking the status of memory</a><br> |
| 5.4 <a href="#sys_calls">System calls</a><br> |
| 5.5 <a href="#sys_signals">Signals</a><br> |
| |
| <h4>6 <a href="#example">An example</a></h4> |
| |
| <h4>7 <a href="techdocs.html">The design and implementation of Valgrind</a></h4> |
| |
| <hr width="100%"> |
| |
| <a name="intro"></a> |
| <h2>1 Introduction</h2> |
| |
| <a name="whatfor"></a> |
| <h3>1.1 What Valgrind is for</h3> |
| |
| Valgrind is a tool to help you find memory-management problems in your |
| programs. When a program is run under Valgrind's supervision, all |
| reads and writes of memory are checked, and calls to |
| malloc/new/free/delete are intercepted. As a result, Valgrind can |
| detect problems such as: |
| <ul> |
| <li>Use of uninitialised memory</li> |
| <li>Reading/writing memory after it has been free'd</li> |
| <li>Reading/writing off the end of malloc'd blocks</li> |
| <li>Reading/writing inappropriate areas on the stack</li> |
| <li>Memory leaks -- where pointers to malloc'd blocks are lost forever</li> |
| </ul> |
| |
| Problems like these can be difficult to find by other means, often |
| lying undetected for long periods, then causing occasional, |
| difficult-to-diagnose crashes. |
| |
| <p> |
| Valgrind is closely tied to details of the CPU and operating system and, |
| to a lesser extent, the compiler and basic C libraries. This makes it |
| difficult to port, so I have chosen at the outset to |
| concentrate on what I believe to be a widely used platform: Red Hat |
| Linux 7.2, on x86s. I believe that it will work without significant |
| difficulty on other x86 GNU/Linux systems which use the 2.4 kernel and |
| GNU libc 2.2.X, for example SuSE 7.1 and Mandrake 8.0. It has also |
| worked in the past, and probably still does, on Red Hat 7.1 and 6.2. |
| Note that I haven't compiled it on those two for a while, so they may |
| no longer work. |
| <p> |
| (Early Feb 02: after feedback from the KDE people it also works better |
| on other Linuxes). |
| <p> |
| At some point in the past, Valgrind has also worked on Red Hat 6.2 |
| (x86), thanks to the efforts of Rob Noble. |
| |
| <p> |
| Valgrind is licensed under the GNU General Public License, version |
| 2. Read the file LICENSE in the source distribution for details. |
| |
| <a name="whatdoes"></a> |
| <h3>1.2 What it does with your program</h3> |
| |
| Valgrind is designed to be as non-intrusive as possible. It works |
| directly with existing executables. You don't need to recompile, |
| relink, or otherwise modify, the program to be checked. Simply place |
| the word <code>valgrind</code> at the start of the command line |
| normally used to run the program. So, for example, if you want to run |
| the command <code>ls -l</code> on Valgrind, simply issue the |
| command: <code>valgrind ls -l</code>. |
| |
| <p>Valgrind takes control of your program before it starts. Debugging |
| information is read from the executable and associated libraries, so |
| that error messages can be phrased in terms of source code |
| locations. Your program is then run on a synthetic x86 CPU which |
| checks every memory access. All detected errors are written to a |
| log. When the program finishes, Valgrind searches for and reports on |
| leaked memory. |
| |
| <p>You can run pretty much any dynamically linked ELF x86 executable using |
| Valgrind. Programs run 25 to 50 times slower, and take a lot more |
| memory, than they usually would. It works well enough to run large |
| programs. For example, the Konqueror web browser from the KDE Desktop |
| Environment, version 2.1.1, runs slowly but usably on Valgrind. |
| |
| <p>Valgrind simulates every single instruction your program executes. |
| Because of this, it finds errors not only in your application but also |
| in all supporting dynamically-linked (.so-format) libraries, including |
| the GNU C library, the X client libraries, Qt, if you work with KDE, and |
| so on. That often includes libraries, for example the GNU C library, |
| which contain memory access violations, but which you cannot or do not |
| want to fix. |
| |
| <p>Rather than swamping you with errors in which you are not |
| interested, Valgrind allows you to selectively suppress errors, by |
| recording them in a suppressions file which is read when Valgrind |
| starts up. As supplied, Valgrind comes with a suppressions file |
| designed to give reasonable behaviour on Red Hat 7.2 (also 7.1 and |
| 6.2) when running text-only and simple X applications. |
| |
| <p><a href="#example">Section 6</a> shows an example of use. |
| <p> |
| <hr width="100%"> |
| |
| <a name="howtouse"></a> |
| <h2>2 How to use it, and how to make sense of the results</h2> |
| |
| <a name="starta"></a> |
| <h3>2.1 Getting started</h3> |
| |
| First off, consider whether it might be beneficial to recompile your |
| application and supporting libraries with optimisation disabled and |
| debugging info enabled (the <code>-g</code> flag). You don't have to |
| do this, but doing so helps Valgrind produce more accurate and less |
| confusing error reports. Chances are you're set up like this already, |
| if you intended to debug your program with GNU gdb, or some other |
| debugger. |
| |
| <p>Then just run your application, but place the word |
| <code>valgrind</code> in front of your usual command-line invocation. |
| Note that you should run the real (machine-code) executable here. If |
| your application is started by, for example, a shell or perl script, |
| you'll need to modify it to invoke Valgrind on the real executables. |
| Running such scripts directly under Valgrind will result in you |
| getting error reports pertaining to <code>/bin/sh</code>, |
| <code>/usr/bin/perl</code>, or whatever interpreter you're using. |
| This almost certainly isn't what you want and can be hugely confusing. |
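| |
| <p>If you are not sure whether a given file is a real executable or a |
| script, its first few bytes will tell you. The following is a minimal |
| sketch and not part of Valgrind: the names <code>file_kind</code> and |
| <code>classify_header</code> are made up for illustration. It assumes |
| only that ELF executables begin with the byte 0x7F followed by |
| "ELF", and that scripts begin with "#!". |

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper types and names; nothing here comes from Valgrind. */
enum file_kind { KIND_ELF, KIND_SCRIPT, KIND_UNKNOWN };

/* Classify a file from its first `len` bytes. */
enum file_kind classify_header(const unsigned char *buf, size_t len)
{
    static const unsigned char elf_magic[4] = { 0x7f, 'E', 'L', 'F' };

    assert(buf != NULL || len == 0);
    if (len >= 4 && memcmp(buf, elf_magic, 4) == 0)
        return KIND_ELF;      /* machine code: run this under Valgrind */
    if (len >= 2 && buf[0] == '#' && buf[1] == '!')
        return KIND_SCRIPT;   /* Valgrind would check the interpreter */
    return KIND_UNKNOWN;
}
```

| <p>Read the first four bytes of the file into a buffer and pass them |
| in. Anything classified as a script is started by an interpreter, and |
| it is that interpreter which Valgrind would end up checking. |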
| |
| <a name="comment"></a> |
| <h3>2.2 The commentary</h3> |
| |
| Valgrind writes a commentary, detailing error reports and other |
| significant events. The commentary goes to standard error (file |
| descriptor 2) by default. This may interfere with your program's own |
| use of stderr, so you can ask for it to be directed elsewhere. |
| |
| <p>All lines in the commentary are of the following form:<br> |
| <pre> |
| ==12345== some-message-from-Valgrind |
| </pre> |
| <p>The <code>12345</code> is the process ID. This scheme makes it easy |
| to distinguish program output from Valgrind commentary, and also easy |
| to differentiate commentaries from different processes which have |
| become merged together, for whatever reason. |
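| |
| <p>Because every commentary line carries this prefix, picking the |
| commentary out of a merged log is mechanical. Here is a minimal sketch |
| in C. It assumes the prefix is exactly two equals signs, a decimal |
| process ID, and two more equals signs; the function name |
| <code>commentary_pid</code> is hypothetical. |

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of Valgrind: if `line` starts with
   "==", a decimal process ID and "==", return that ID; else return -1. */
long commentary_pid(const char *line)
{
    char *end;
    long pid;

    assert(line != NULL);
    if (strncmp(line, "==", 2) != 0)
        return -1;
    if (line[2] < '0' || line[2] > '9')   /* the ID must start with a digit */
        return -1;
    pid = strtol(line + 2, &end, 10);
    if (strncmp(end, "==", 2) != 0)       /* closing "==" must follow the ID */
        return -1;
    return pid;
}
```

| <p>Lines for which this returns -1 can be treated as the program's own |
| output; the others can be grouped by the process ID returned. |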
| |
| <p>By default, Valgrind writes only essential messages to the commentary, |
| so as to avoid flooding you with information of secondary importance. |
| If you want more information about what is happening, re-run, passing |
| the <code>-v</code> flag to Valgrind. |
| |
| |
| <a name="report"></a> |
| <h3>2.3 Reporting of errors</h3> |
| |
| When Valgrind detects something bad happening in the program, an error |
| message is written to the commentary. For example:<br> |
| <pre> |
| ==25832== Invalid read of size 4 |
| ==25832== at 0x8048724: BandMatrix::ReSize(int, int, int) (bogon.cpp:45) |
| ==25832== by 0x80487AF: main (bogon.cpp:66) |
| ==25832== by 0x40371E5E: __libc_start_main (libc-start.c:129) |
| ==25832== by 0x80485D1: (within /home/sewardj/newmat10/bogon) |
| ==25832== Address 0xBFFFF74C is not stack'd, malloc'd or free'd |
| </pre> |
| |
| <p>This message says that the program did an illegal 4-byte read of |
| address 0xBFFFF74C, which, as far as Valgrind can tell, is neither a valid |
| stack address nor part of any currently malloc'd or free'd block. |
| The read is happening at line 45 of <code>bogon.cpp</code>, called |
| from line 66 of the same file, etc. For errors associated with an |
| identified malloc'd/free'd block, for example reading free'd memory, |
| Valgrind reports not only the location where the error happened, but |
| also where the associated block was malloc'd/free'd. |
| |
| <p>Valgrind remembers all error reports. When an error is detected, |
| it is compared against old reports, to see if it is a duplicate. If |
| so, the error is noted, but no further commentary is emitted. This |
| avoids you being swamped with bazillions of duplicate error reports. |
| |
| <p>If you want to know how many times each error occurred, run with |
| the <code>-v</code> option. When execution finishes, all the reports |
| are printed out, along with, and sorted by, their occurrence counts. |
| This makes it easy to see which errors have occurred most frequently. |
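| |
| <p>The bookkeeping described above can be sketched as follows. This is |
| an illustration only, not Valgrind's actual data structure: the names, |
| the fixed-size table and the use of raw code addresses to stand for |
| program locations are all assumptions. It compares three locations per |
| error, which is the number the <code>--num-callers</code> description |
| in <a href="#flags">Section 2.5</a> gives for commoning up errors. |

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ERRORS 64   /* arbitrary capacity for this sketch */
#define CMP_FRAMES  3   /* locations compared when looking for duplicates */

struct error_rec {
    const char   *kind;                /* e.g. "Invalid read of size 4" */
    unsigned long frames[CMP_FRAMES];  /* code addresses, innermost first */
    unsigned      count;               /* how many times it occurred */
};

struct error_table {
    struct error_rec recs[MAX_ERRORS];
    size_t           used;
};

/* Returns 1 if the error is new and should be shown, or 0 if it is a
   duplicate, in which case only its occurrence count is increased. */
int record_error(struct error_table *t, const char *kind,
                 const unsigned long frames[CMP_FRAMES])
{
    size_t i;

    assert(t != NULL && kind != NULL && frames != NULL);
    for (i = 0; i < t->used; i++) {
        struct error_rec *r = &t->recs[i];
        if (strcmp(r->kind, kind) == 0 &&
            memcmp(r->frames, frames, sizeof r->frames) == 0) {
            r->count++;
            return 0;
        }
    }
    assert(t->used < MAX_ERRORS);
    t->recs[t->used].kind = kind;
    memcpy(t->recs[t->used].frames, frames, sizeof t->recs[0].frames);
    t->recs[t->used].count = 1;
    t->used++;
    return 1;
}
```

| <p>Sorting <code>recs</code> by <code>count</code> at exit would give |
| the kind of summary that <code>-v</code> prints. |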
| |
| <p>Errors are reported before the associated operation actually |
| happens. For example, if your program decides to read from address |
| zero, Valgrind will emit a message to this effect, and the program |
| will then duly die with a segmentation fault. |
| |
| <p>In general, you should try and fix errors in the order that they |
| are reported. Not doing so can be confusing. For example, a program |
| which copies uninitialised values to several memory locations, and |
| later uses them, will generate several error messages. The first such |
| error message may well give the most direct clue to the root cause of |
| the problem. |
| |
| <a name="suppress"></a> |
| <h3>2.4 Suppressing errors</h3> |
| |
| Valgrind detects numerous problems in the base libraries, such as the |
| GNU C library, and the XFree86 client libraries, which come |
| pre-installed on your GNU/Linux system. You can't easily fix these, |
| but you don't want to see these errors (and yes, there are many!). So |
| Valgrind reads a list of errors to suppress at startup. By default |
| this file is <code>redhat72.supp</code>, located in the Valgrind |
| installation directory. |
| |
| <p>You can modify and add to the suppressions file at your leisure, or |
| write your own. Multiple suppression files are allowed. This is |
| useful if part of your project contains errors you can't or don't want |
| to fix, yet you don't want to continuously be reminded of them. |
| |
| <p>Each error to be suppressed is described very specifically, to |
| minimise the possibility that a suppression-directive inadvertently |
| suppresses a bunch of similar errors which you did want to see. The |
| suppression mechanism is designed to allow precise yet flexible |
| specification of errors to suppress. |
| |
| <p>If you use the <code>-v</code> flag, at the end of execution, Valgrind |
| prints out one line for each used suppression, giving its name and the |
| number of times it got used. Here's the suppressions used by a run of |
| <code>ls -l</code>: |
| <pre> |
| --27579-- supp: 1 socketcall.connect(serv_addr)/__libc_connect/__nscd_getgrgid_r |
| --27579-- supp: 1 socketcall.connect(serv_addr)/__libc_connect/__nscd_getpwuid_r |
| --27579-- supp: 6 strrchr/_dl_map_object_from_fd/_dl_map_object |
| </pre> |
| |
| <a name="flags"></a> |
| <h3>2.5 Command-line flags</h3> |
| |
| You invoke Valgrind like this: |
| <pre> |
| valgrind [options-for-Valgrind] your-prog [options for your-prog] |
| </pre> |
| |
| <p>Valgrind's default settings succeed in giving reasonable behaviour |
| in most cases. Available options, in no particular order, are as |
| follows: |
| <ul> |
| <li><code>--help</code></li><br> |
| |
| <li><code>--version</code><br> |
| <p>The usual deal.</li><br><p> |
| |
| <li><code>-v --verbose</code><br> |
| <p>Be more verbose. Gives extra information on various aspects |
| of your program, such as: the shared objects loaded, the |
| suppressions used, the progress of the instrumentation engine, |
| and warnings about unusual behaviour. |
| </li><br><p> |
| |
| <li><code>-q --quiet</code><br> |
| <p>Run silently, and only print error messages. Useful if you |
| are running regression tests or have some other automated test |
| machinery. |
| </li><br><p> |
| |
| <li><code>--demangle=no</code><br> |
| <code>--demangle=yes</code> [the default] |
| <p>Disable/enable automatic demangling (decoding) of C++ names. |
| Enabled by default. When enabled, Valgrind will attempt to |
| translate encoded C++ procedure names back to something |
| approaching the original. The demangler handles symbols mangled |
| by g++ versions 2.X and 3.X. |
| |
| <p>An important fact about demangling is that function |
| names mentioned in suppressions files should be in their mangled |
| form. Valgrind does not demangle function names when searching |
| for applicable suppressions, because to do otherwise would make |
| suppressions file contents dependent on the state of Valgrind's |
| demangling machinery, and would also be slow and pointless. |
| </li><br><p> |
| |
| <li><code>--num-callers=&lt;number&gt;</code> [default=4]<br> |
| <p>By default, Valgrind shows four levels of function call names |
| to help you identify program locations. You can change that |
| number with this option. This can help in determining the |
| program's location in deeply-nested call chains. Note that errors |
| are commoned up using only the top three function locations (the |
| place in the current function, and that of its two immediate |
| callers). So this doesn't affect the total number of errors |
| reported. |
| <p> |
| The maximum value for this is 50. Note that higher settings |
| will make Valgrind run a bit more slowly and take a bit more |
| memory, but can be useful when working with programs with |
| deeply-nested call chains. |
| </li><br><p> |
| |
| <li><code>--gdb-attach=no</code> [the default]<br> |
| <code>--gdb-attach=yes</code> |
| <p>When enabled, Valgrind will pause after every error shown, |
| and print the line |
| <br> |
| <code>---- Attach to GDB ? --- [Return/N/n/Y/y/C/c] ----</code> |
| <p> |
| Pressing <code>Ret</code>, or <code>N</code> <code>Ret</code> |
| or <code>n</code> <code>Ret</code>, causes Valgrind not to |
| start GDB for this error. |
| <p> |
| <code>Y</code> <code>Ret</code> |
| or <code>y</code> <code>Ret</code> causes Valgrind to |
| start GDB, for the program at this point. When you have |
| finished with GDB, quit from it, and the program will continue. |
| Trying to continue from inside GDB doesn't work. |
| <p> |
| <code>C</code> <code>Ret</code> |
| or <code>c</code> <code>Ret</code> causes Valgrind not to |
| start GDB, and not to ask again. |
| <p> |
| <code>--gdb-attach=yes</code> conflicts with |
| <code>--trace-children=yes</code>. You can't use them |
| together. Valgrind refuses to start up in this situation. |
| </li><br><p> |
| |
| <li><code>--partial-loads-ok=yes</code> [the default]<br> |
| <code>--partial-loads-ok=no</code> |
| <p>Controls how Valgrind handles word (4-byte) loads from |
| addresses for which some bytes are addressable and others |
| are not. When <code>yes</code> (the default), such loads |
| do not elicit an address error. Instead, the loaded V bytes |
| corresponding to the illegal addresses indicate undefined, and |
| those corresponding to legal addresses are loaded from shadow |
| memory, as usual. |
| <p> |
| When <code>no</code>, loads from partially |
| invalid addresses are treated the same as loads from completely |
| invalid addresses: an illegal-address error is issued, |
| and the resulting V bytes indicate valid data. |
| </li><br><p> |
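| <p>The two behaviours of <code>--partial-loads-ok</code> can be |
| summarised in a few lines of C. This is a sketch of the rule as |
| described above, not Valgrind's code. The function name and the V-byte |
| encoding (0x00 for a fully valid byte, 0xFF for a fully undefined one) |
| are assumptions made for the example. |

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define V_DEFINED   0x00u   /* assumed encoding: whole byte valid */
#define V_UNDEFINED 0xFFu   /* assumed encoding: whole byte undefined */

/* addressable[i] is nonzero if byte i of the word may be accessed;
   shadow_v[i] is the V byte held in shadow memory for that byte.
   Writes the V bytes of the loaded value to out_v and returns 1 if an
   illegal-address error should be reported, 0 otherwise. */
int load_word_vbytes(const int addressable[4], const uint8_t shadow_v[4],
                     int partial_loads_ok, uint8_t out_v[4])
{
    int i, n_ok = 0;

    assert(addressable != NULL && shadow_v != NULL && out_v != NULL);
    for (i = 0; i < 4; i++)
        if (addressable[i])
            n_ok++;

    if (n_ok == 4 || (n_ok > 0 && partial_loads_ok)) {
        /* legal bytes come from shadow memory, illegal ones are undefined */
        for (i = 0; i < 4; i++)
            out_v[i] = addressable[i] ? shadow_v[i] : V_UNDEFINED;
        return 0;
    }

    /* completely invalid, or partially invalid with the flag set to no */
    for (i = 0; i < 4; i++)
        out_v[i] = V_DEFINED;
    return 1;
}
```

| <p>Marking the result as valid in the error case keeps one bad load |
| from producing a second, undefined-value error later on. |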
| |
| <li><code>--sloppy-malloc=no</code> [the default]<br> |
| <code>--sloppy-malloc=yes</code> |
| <p>When enabled, all requests for malloc/calloc are rounded up |
| to a whole number of machine words -- in other words, made |
| divisible by 4. For example, a request for 17 bytes of space |
| would result in a 20-byte area being made available. This works |
| around bugs in sloppy libraries which assume that they can |
| safely rely on malloc/calloc requests being rounded up in this |
| fashion. Without the workaround, these libraries tend to |
| generate large numbers of errors when they access the ends of |
| these areas. Valgrind snapshots dated 17 Feb 2002 and later are |
| cleverer about this problem, and you should no longer need to |
| use this flag. |
| </li><br><p> |
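| <p>The rounding that <code>--sloppy-malloc=yes</code> applies is |
| simple arithmetic. A minimal sketch, assuming 4-byte machine words as |
| stated above; the function name is made up for the example. |

```c
#include <assert.h>
#include <stddef.h>

/* Round a request size up to a whole number of 4-byte machine words. */
size_t round_up_to_word(size_t nbytes)
{
    assert(nbytes <= (size_t)-1 - 3);   /* the addition must not overflow */
    return (nbytes + 3) & ~(size_t)3;
}
```

| <p>A request for 17 bytes becomes 20, while a request for 16 bytes is |
| left alone. |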
| |
| <li><code>--trace-children=no</code> [the default]<br> |
| <code>--trace-children=yes</code> |
| <p>When enabled, Valgrind will trace into child processes. This |
| is confusing and usually not what you want, so is disabled by |
| default.</li><br><p> |
| |
| <li><code>--freelist-vol=&lt;number&gt;</code> [default: 1000000] |
| <p>When the client program releases memory using free (in C) or |
| delete (C++), that memory is not immediately made available for |
| re-allocation. Instead it is marked inaccessible and placed in |
| a queue of freed blocks. The purpose is to delay the point at |
| which freed-up memory comes back into circulation. This |
| increases the chance that Valgrind will be able to detect |
| invalid accesses to blocks for some significant period of time |
| after they have been freed. |
| <p> |
| This flag specifies the maximum total size, in bytes, of the |
| blocks in the queue. The default value is one million bytes. |
| Increasing this increases the total amount of memory used by |
| Valgrind but may detect invalid uses of freed blocks which would |
| otherwise go undetected.</li><br><p> |
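| <p>The queue of freed blocks can be pictured like this. It is a |
| sketch under stated assumptions rather than Valgrind's implementation: |
| the type and function names are invented, the queue is a plain linked |
| list, and the step of marking a block inaccessible is left out because |
| only Valgrind itself can do that. |

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct freed_block {
    void               *ptr;
    size_t              size;
    struct freed_block *next;
};

struct free_queue {
    struct freed_block *oldest, *newest;
    size_t volume;       /* total bytes currently held in the queue */
    size_t max_volume;   /* plays the role of --freelist-vol */
};

/* Queue a block instead of releasing it at once. Returns the number of
   blocks that really were released to keep the volume within bounds. */
int deferred_free(struct free_queue *q, void *ptr, size_t size)
{
    struct freed_block *node;
    int released = 0;

    assert(q != NULL && ptr != NULL);
    node = malloc(sizeof *node);
    if (node == NULL) {          /* no room to queue it: release now */
        free(ptr);
        return 1;
    }
    node->ptr = ptr;
    node->size = size;
    node->next = NULL;
    if (q->newest != NULL)
        q->newest->next = node;
    else
        q->oldest = node;
    q->newest = node;
    q->volume += size;

    /* release the oldest blocks until the volume fits again */
    while (q->volume > q->max_volume && q->oldest != NULL) {
        struct freed_block *old = q->oldest;
        q->oldest = old->next;
        if (q->oldest == NULL)
            q->newest = NULL;
        q->volume -= old->size;
        free(old->ptr);
        free(old);
        released++;
    }
    return released;
}
```

| <p>A larger <code>max_volume</code> keeps blocks in the queue for |
| longer, which is exactly the trade-off between memory use and detection |
| described above. |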
| |
| <li><code>--logfile-fd=&lt;number&gt;</code> [default: 2, stderr] |
| <p>Specifies the file descriptor on which Valgrind communicates |
| all of its messages. The default, 2, is the standard error |
| channel. This may interfere with the client's own use of |
| stderr. To dump Valgrind's commentary in a file without using |
| stderr, something like the following works well (sh/bash |
| syntax):<br> |
| <code> |
| valgrind --logfile-fd=9 my_prog 9&gt; logfile</code><br> |
| That is: tell Valgrind to send all output to file descriptor 9, |
| and ask the shell to route file descriptor 9 to "logfile". |
| </li><br><p> |
| |
| <li><code>--suppressions=&lt;filename&gt;</code> [default: |
| /installation/directory/redhat72.supp] <p>Specifies an extra |
| file from which to read descriptions of errors to suppress. You |
| may use as many extra suppressions files as you |
| like.</li><br><p> |
| |
| <li><code>--leak-check=no</code> [default]<br> |
| <code>--leak-check=yes</code> |
| <p>When enabled, search for memory leaks when the client program |
| finishes. A memory leak means a malloc'd block, which has not |
| yet been free'd, but to which no pointer can be found. Such a |
| block can never be free'd by the program, since no pointer to it |
| exists. Leak checking is disabled by default |
| because it tends to generate dozens of error messages. |
| </li><br><p> |
| |
| <li><code>--show-reachable=no</code> [default]<br> |
| <code>--show-reachable=yes</code> <p>When disabled, the memory |
| leak detector only shows blocks to which it cannot find any |
| pointer at all, or to which it can only find a pointer into the |
| middle. These blocks are prime candidates for memory leaks. When |
| enabled, the leak detector also reports on blocks which it could |
| find a pointer to. Your program could, at least in principle, |
| have freed such blocks before exit. Contrast this to blocks for |
| which no pointer, or only an interior pointer could be found: |
| they are more likely to indicate memory leaks, because |
| you do not actually have a pointer to the start of the block |
| which you can hand to free(), even if you wanted to. |
| </li><br><p> |
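| <p>The three cases distinguished above (a pointer to the start of a |
| block, only a pointer into its middle, or no pointer at all) can be |
| sketched as below. How Valgrind actually gathers candidate pointers is |
| not covered here, so the <code>candidates</code> array simply stands |
| for whatever pointer-sized values a scan turned up. All names are |
| hypothetical. |

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum block_status {
    BLOCK_REACHABLE,   /* a pointer to the start of the block was found */
    BLOCK_INTERIOR,    /* only pointers into the middle were found */
    BLOCK_LOST         /* no pointer into the block was found */
};

/* Classify the block [start, start + size) against candidate pointers. */
enum block_status classify_block(uintptr_t start, size_t size,
                                 const uintptr_t *candidates, size_t n)
{
    enum block_status status = BLOCK_LOST;
    size_t i;

    assert(candidates != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (candidates[i] == start)
            return BLOCK_REACHABLE;
        if (candidates[i] > start && candidates[i] - start < size)
            status = BLOCK_INTERIOR;
    }
    return status;
}
```

| <p>With <code>--show-reachable=no</code> only the last two kinds of |
| block would be reported; with <code>yes</code>, all three. |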
| |
| <li><code>--leak-resolution=low</code> [default]<br> |
| <code>--leak-resolution=med</code> <br> |
| <code>--leak-resolution=high</code> |
| <p>When doing leak checking, determines how willing Valgrind is |
| to consider different backtraces the same. When set to |
| <code>low</code>, the default, only the first two entries need |
| match. When <code>med</code>, four entries have to match. When |
| <code>high</code>, all entries need to match. |
| <p> |
| For hardcore leak debugging, you probably want to use |
| <code>--leak-resolution=high</code> together with |
| <code>--num-callers=40</code> or some such large number. Note |
| however that this can give an overwhelming amount of |
| information, which is why the defaults are 4 callers and |
| low-resolution matching. |
| <p> |
| Note that the <code>--leak-resolution=</code> setting does not |
| affect Valgrind's ability to find leaks. It only changes how |
| the results are presented to you. |
| </li><br><p> |
| |
| <li><code>--workaround-gcc296-bugs=no</code> [default]<br> |
| <code>--workaround-gcc296-bugs=yes</code> <p>When enabled, |
| Valgrind assumes that reads and writes some small distance below the stack |
| pointer <code>%esp</code> are due to bugs in gcc 2.96, and does |
| not report them. The "small distance" is 256 bytes by default. |
| Note that gcc 2.96 is the default compiler on some popular Linux |
| distributions (RedHat 7.X, Mandrake) and so you may well need to |
| use this flag. Do not use it if you do not have to, as it can |
| cause real errors to be overlooked. A better option is to use a |
| gcc/g++ which works properly; 2.95.3 seems to be a good choice. |
| <p> |
| Unfortunately (27 Feb 02) it looks like g++ 3.0.4 is similarly |
| buggy, so you may need to issue this flag if you use 3.0.4. |
| </li><br><p> |
| |
| <li><code>--client-perms=no</code> [default]<br> |
| <code>--client-perms=yes</code> <p>An experimental feature. |
| <p> |
| When enabled, and when <code>--instrument=yes</code> (which is |
| the default), Valgrind honours client directives to set and |
| query address range permissions. This allows the client program |
| to tell Valgrind about changes in memory range permissions that |
| Valgrind would not otherwise know about, and so allows clients |
| to get Valgrind to do arbitrary custom checks. |
| <p> |
| Clients need to include the header file <code>valgrind.h</code> |
| to make this work. The macros therein have the magical property |
| that they generate code in-line which Valgrind can spot. |
| However, the code does nothing when not run on Valgrind, so you |
| are not forced to run your program on Valgrind just because you |
| use the macros in this file. |
| <p> |
| A brief description of the available macros: |
| <ul> |
| <li><code>VALGRIND_MAKE_NOACCESS</code>, |
| <code>VALGRIND_MAKE_WRITABLE</code> and |
| <code>VALGRIND_MAKE_READABLE</code>. These mark address |
| ranges as completely inaccessible, accessible but containing |
| undefined data, and accessible and containing defined data, |
| respectively. Subsequent errors may have their faulting |
| addresses described in terms of these blocks. Returns a |
| "block handle". |
| <p> |
| <li><code>VALGRIND_DISCARD</code>: At some point you may want |
| Valgrind to stop reporting errors in terms of the blocks |
| defined by the previous three macros. To do this, the above |
| macros return a small-integer "block handle". You can pass |
| this block handle to <code>VALGRIND_DISCARD</code>. After |
| doing so, Valgrind will no longer be able to relate |
| addressing errors to the user-defined block associated with |
| the handle. The permissions settings associated with the |
| handle remain in place; this just affects how errors are |
| reported, not whether they are reported. Returns 1 for an |
| invalid handle and 0 for a valid handle (although passing |
| invalid handles is harmless). |
| <p> |
| <li><code>VALGRIND_CHECK_NOACCESS</code>, |
| <code>VALGRIND_CHECK_WRITABLE</code> and |
| <code>VALGRIND_CHECK_READABLE</code>: check immediately |
| whether or not the given address range has the relevant |
| property, and if not, print an error message. Also, for the |
| convenience of the client, returns zero if the relevant |
| property holds; otherwise, the returned value is the address |
| of the first byte for which the property is not true. |
| <p> |
| <li><code>VALGRIND_CHECK_DEFINED</code>: a quick and easy way |
| to find out whether Valgrind thinks a particular variable |
| (lvalue, to be precise) is addressable and defined. Prints |
| an error message if not. Returns no value. |
| <p> |
| <li><code>VALGRIND_MAKE_NOACCESS_STACK</code>: a highly |
| experimental feature. Similarly to |
| <code>VALGRIND_MAKE_NOACCESS</code>, this marks an address |
| range as inaccessible, so that subsequent accesses to an |
| address in the range give an error. However, this macro |
| does not return a block handle. Instead, all annotations |
| created like this are reviewed at each client |
| <code>ret</code> (subroutine return) instruction, and those |
| which now define an address range below the client's stack |
| pointer register (<code>%esp</code>) are automatically |
| deleted. |
| <p> |
| In other words, this macro allows the client to tell |
| Valgrind about red-zones on its own stack. Valgrind |
| automatically discards this information when the stack |
| retreats past such blocks. Beware: hacky and flaky. |
| </ul> |
| </li> |
| <p> |
| As of 17 March 02 (the time of writing this), there is a small |
| problem with all of these macros, which is that I haven't |
| figured out how to make them produce sensible (always-succeeds) |
| return values when the client is run on the real CPU or on |
| Valgrind without <code>--client-perms=yes</code>. So if you |
| write client code which depends on the return values, be aware |
| that it may misbehave when not run with full Valgrindification. |
| If you always ignore the return values you should always be |
| safe. I plan to fix this. |
| </ul> |
| |
| There are also some options for debugging Valgrind itself. You |
| shouldn't need to use them in the normal run of things. Nevertheless: |
| |
| <ul> |
| |
| <li><code>--single-step=no</code> [default]<br> |
| <code>--single-step=yes</code> |
| <p>When enabled, each x86 insn is translated separately into |
| instrumented code. When disabled, translation is done on a |
| per-basic-block basis, giving much better translations.</li><br> |
| <p> |
| |
| <li><code>--optimise=no</code><br> |
| <code>--optimise=yes</code> [default] |
| <p>When enabled, various improvements are applied to the |
| intermediate code, mainly aimed at allowing the simulated CPU's |
| registers to be cached in the real CPU's registers over several |
| simulated instructions.</li><br> |
| <p> |
| |
| <li><code>--instrument=no</code><br> |
| <code>--instrument=yes</code> [default] |
| <p>When disabled, the translations don't actually contain any |
| instrumentation.</li><br> |
| <p> |
| |
| <li><code>--cleanup=no</code><br> |
| <code>--cleanup=yes</code> [default] |
| <p>When enabled, various improvements are applied to the |
| post-instrumented intermediate code, aimed at removing redundant |
| value checks.</li><br> |
| <p> |
| |
| <li><code>--trace-syscalls=no</code> [default]<br> |
| <code>--trace-syscalls=yes</code> |
| <p>Enable/disable tracing of system call intercepts.</li><br> |
| <p> |
| |
| <li><code>--trace-signals=no</code> [default]<br> |
| <code>--trace-signals=yes</code> |
| <p>Enable/disable tracing of signal handling.</li><br> |
| <p> |
| |
| <li><code>--trace-symtab=no</code> [default]<br> |
| <code>--trace-symtab=yes</code> |
| <p>Enable/disable tracing of symbol table reading.</li><br> |
| <p> |
| |
| <li><code>--trace-malloc=no</code> [default]<br> |
| <code>--trace-malloc=yes</code> |
| <p>Enable/disable tracing of malloc/free (et al) intercepts. |
| </li><br> |
| <p> |
| |
| <li><code>--stop-after=&lt;number&gt;</code> |
| [default: infinity, more or less] |
| <p>After &lt;number&gt; basic blocks have been executed, shut down |
| Valgrind and switch back to running the client on the real CPU. |
| </li><br> |
| <p> |
| |
| <li><code>--dump-error=&lt;number&gt;</code> |
| [default: inactive] |
| <p>After the program has exited, show gory details of the |
| translation of the basic block containing the &lt;number&gt;'th |
| error context. When used with <code>--single-step=yes</code>, |
| can show the |
| exact x86 instruction causing an error.</li><br> |
| <p> |
| |
| <li><code>--smc-check=none</code><br> |
| <code>--smc-check=some</code> [default]<br> |
| <code>--smc-check=all</code> |
| <p>How carefully should Valgrind check for self-modifying code |
| writes, so that translations can be discarded? When |
| "none", no writes are checked. When "some", only writes |
| resulting from moves from integer registers to memory are |
| checked. When "all", all memory writes are checked, even those |
| with which no sane program would generate code -- for |
| example, floating-point writes.</li> |
| </ul> |
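| |
| <p>The three <code>--smc-check</code> settings amount to a small |
| decision table. The sketch below restates it in C; the enum and |
| function names are invented for the example, and the split of writes |
| into "from an integer register" and "other" is a simplification of the |
| description above. |

```c
#include <assert.h>

enum smc_check  { SMC_NONE, SMC_SOME, SMC_ALL };
enum write_kind { WRITE_FROM_INT_REG, WRITE_OTHER };

/* Should this memory write be checked for overwriting translated code? */
int must_check_write(enum smc_check mode, enum write_kind kind)
{
    switch (mode) {
    case SMC_NONE: return 0;                           /* check nothing */
    case SMC_SOME: return kind == WRITE_FROM_INT_REG;  /* integer moves only */
    case SMC_ALL:  return 1;                           /* check everything */
    }
    assert(!"unknown smc_check mode");
    return 0;
}
```

| <p>A floating-point store, for instance, falls under |
| <code>WRITE_OTHER</code> and is only checked with |
| <code>--smc-check=all</code>. |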
| |
| |
| <a name="errormsgs"></a> |
| <h3>2.6 Explanation of error messages</h3> |
| |
| Despite considerable sophistication under the hood, Valgrind can only |
| really detect two kinds of errors: use of illegal addresses, and use |
| of undefined values. Nevertheless, this is enough to help you |
| discover all sorts of memory-management nasties in your code. This |
| section presents a quick summary of what error messages mean. The |
| precise behaviour of the error-checking machinery is described in |
| <a href="#machine">Section 3</a>. |
| |
| |
| <h4>2.6.1 Illegal read / Illegal write errors</h4> |
| For example: |
| <pre> |
| ==30975== Invalid read of size 4 |
| ==30975== at 0x40F6BBCC: (within /usr/lib/libpng.so.2.1.0.9) |
| ==30975== by 0x40F6B804: (within /usr/lib/libpng.so.2.1.0.9) |
| ==30975== by 0x40B07FF4: read_png_image__FP8QImageIO (kernel/qpngio.cpp:326) |
| ==30975== by 0x40AC751B: QImageIO::read() (kernel/qimage.cpp:3621) |
| ==30975== Address 0xBFFFF0E0 is not stack'd, malloc'd or free'd |
| </pre> |
| |
| <p>This happens when your program reads or writes memory at a place |
| which Valgrind reckons it shouldn't. In this example, the program did |
| a 4-byte read at address 0xBFFFF0E0, somewhere within the |
| system-supplied library libpng.so.2.1.0.9, which was called from |
| somewhere else in the same library, called from line 326 of |
| qpngio.cpp, and so on. |
| |
| <p>Valgrind tries to establish what the illegal address might relate |
| to, since that's often useful. So, if it points into a block of |
| memory which has already been freed, you'll be informed of this, and |
| also where the block was freed. Likewise, if it should turn out |
| to be just off the end of a malloc'd block, a common result of |
| off-by-one errors in array subscripting, you'll be informed of this |
| fact, and also where the block was malloc'd. |
| |
| <p>In this example, Valgrind can't identify the address. Actually the |
| address is on the stack, but, for some reason, this is not a valid |
| stack address -- it is below the stack pointer, %esp, and that isn't |
| allowed. |
| |
| <p>Note that Valgrind only tells you that your program is about to |
| access memory at an illegal address. It can't stop the access from |
| happening. So, if your program makes an access which normally would |
| result in a segmentation fault, your program will still suffer the same |
| fate -- but you will get a message from Valgrind immediately prior to |
| this. In this particular example, reading junk on the stack is |
| non-fatal, and the program stays alive. |
| |
| |
| <h4>2.6.2 Use of uninitialised values</h4> |
| For example: |
| <pre> |
| ==19146== Conditional jump or move depends on uninitialised value(s) |
| ==19146== at 0x402DFA94: _IO_vfprintf (_itoa.h:49) |
| ==19146== by 0x402E8476: _IO_printf (printf.c:36) |
| ==19146== by 0x8048472: main (tests/manuel1.c:8) |
| ==19146== by 0x402A6E5E: __libc_start_main (libc-start.c:129) |
| </pre> |
| |
| <p>An uninitialised-value use error is reported when your program uses |
| a value which hasn't been initialised -- in other words, is undefined. |
| Here, the undefined value is used somewhere inside the printf() |
| machinery of the C library. This error was reported when running the |
| following small program: |
| <pre> |
| #include <stdio.h> |
| |
| int main() |
| { |
|   int x;   /* deliberately left uninitialised */ |
|   printf ("x = %d\n", x); |
|   return 0; |
| } |
| </pre> |
| |
| <p>It is important to understand that your program can copy around |
| junk (uninitialised) data to its heart's content. Valgrind observes |
| this and keeps track of the data, but does not complain. A complaint |
| is issued only when your program attempts to make use of uninitialised |
| data. In this example, x is uninitialised. Valgrind observes the |
| value being passed to _IO_printf and thence to |
| _IO_vfprintf, but makes no comment. However, |
| _IO_vfprintf has to examine the value of x |
| so it can turn it into the corresponding ASCII string, and it is at |
| this point that Valgrind complains. |
| |
| <p>Sources of uninitialised data tend to be: |
| <ul> |
| <li>Local variables in procedures which have not been initialised, |
| as in the example above.</li><br><p> |
| |
| <li>The contents of malloc'd blocks, before you write something |
| there. In C++, the new operator is a wrapper round malloc, so |
| if you create an object with new, its fields will be |
| uninitialised until you fill them in, which is only Right and |
| Proper.</li> |
| </ul> |
| |
| |
| |
| <h4>2.6.3 Illegal frees</h4> |
| For example: |
| <pre> |
| ==7593== Invalid free() |
| ==7593== at 0x4004FFDF: free (ut_clientmalloc.c:577) |
| ==7593== by 0x80484C7: main (tests/doublefree.c:10) |
| ==7593== by 0x402A6E5E: __libc_start_main (libc-start.c:129) |
| ==7593== by 0x80483B1: (within tests/doublefree) |
| ==7593== Address 0x3807F7B4 is 0 bytes inside a block of size 177 free'd |
| ==7593== at 0x4004FFDF: free (ut_clientmalloc.c:577) |
| ==7593== by 0x80484C7: main (tests/doublefree.c:10) |
| ==7593== by 0x402A6E5E: __libc_start_main (libc-start.c:129) |
| ==7593== by 0x80483B1: (within tests/doublefree) |
| </pre> |
| <p>Valgrind keeps track of the blocks allocated by your program with |
| malloc/new, so it knows exactly whether or not the argument to |
| free/delete is legitimate. Here, the test program has |
| freed the same block twice. As with the illegal read/write errors, |
| Valgrind attempts to make sense of the address freed. If, as |
| here, the address is one which has previously been freed, you will |
| be told that -- making duplicate frees of the same block easy to spot. |
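| |
| <p>The bookkeeping idea can be sketched in a few lines of C. This is |
| only a toy model: the fixed-size table and the names |
| <code>note_alloc</code>, <code>note_free</code> and |
| <code>block_state</code> are invented for illustration and are not |
| Valgrind's real data structures. |

```c
#include <stddef.h>

/* Hypothetical toy tracker, not Valgrind's actual implementation. */
enum { MAX_BLOCKS = 64 };
enum block_state { BLOCK_UNKNOWN, BLOCK_LIVE, BLOCK_FREED };

struct tracked_block { const void *addr; enum block_state state; };
static struct tracked_block tracked[MAX_BLOCKS];
static size_t n_tracked = 0;

/* Record that addr was handed out by an allocation. */
void note_alloc(const void *addr)
{
    for (size_t i = 0; i < n_tracked; i++) {
        if (tracked[i].addr == addr) {       /* address is being reused */
            tracked[i].state = BLOCK_LIVE;
            return;
        }
    }
    if (n_tracked < MAX_BLOCKS) {
        tracked[n_tracked].addr = addr;
        tracked[n_tracked].state = BLOCK_LIVE;
        n_tracked++;
    }
}

/* Classify a free of addr by the state the block had beforehand:
   BLOCK_LIVE    -> legitimate free
   BLOCK_FREED   -> duplicate free of the same block
   BLOCK_UNKNOWN -> address was never issued by the allocator */
enum block_state note_free(const void *addr)
{
    for (size_t i = 0; i < n_tracked; i++) {
        if (tracked[i].addr == addr) {
            enum block_state before = tracked[i].state;
            tracked[i].state = BLOCK_FREED;
            return before;
        }
    }
    return BLOCK_UNKNOWN;
}
```

| <p>Calling <code>note_free</code> twice on the same address yields |
| <code>BLOCK_LIVE</code> and then <code>BLOCK_FREED</code>, which is the |
| situation reported in the example above. |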
| |
| |
| <h4>2.6.4 Passing system call parameters with inadequate |
| read/write permissions</h4> |
| |
| Valgrind checks all parameters to system calls. If a system call |
| needs to read from a buffer provided by your program, Valgrind checks |
| that the entire buffer is addressable and has valid data, i.e., it is |
| readable. And if the system call needs to write to a user-supplied |
| buffer, Valgrind checks that the buffer is addressable. After the |
| system call, Valgrind updates its administrative information to |
| precisely reflect any changes in memory permissions caused by the |
| system call. |
| |
| <p>Here's an example of a system call with an invalid parameter: |
| <pre> |
| #include <stdlib.h> |
| #include <unistd.h> |
| int main( void ) |
| { |
| char* arr = malloc(10); |
| (void) write( 1 /* stdout */, arr, 10 ); |
| return 0; |
| } |
| </pre> |
| |
| <p>You get this complaint ... |
| <pre> |
| ==8230== Syscall param write(buf) lacks read permissions |
| ==8230== at 0x4035E072: __libc_write |
| ==8230== by 0x402A6E5E: __libc_start_main (libc-start.c:129) |
| ==8230== by 0x80483B1: (within tests/badwrite) |
| ==8230== by <bogus frame pointer> ??? |
| ==8230== Address 0x3807E6D0 is 0 bytes inside a block of size 10 alloc'd |
| ==8230== at 0x4004FEE6: malloc (ut_clientmalloc.c:539) |
| ==8230== by 0x80484A0: main (tests/badwrite.c:6) |
| ==8230== by 0x402A6E5E: __libc_start_main (libc-start.c:129) |
| ==8230== by 0x80483B1: (within tests/badwrite) |
| </pre> |
| |
| <p>... because the program has tried to write uninitialised junk from |
| the malloc'd block to the standard output. |
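| |
| <p>For comparison, here is a variant that fills the buffer before |
| handing it to <code>write</code>, so every byte passed to the system |
| call has a defined value. The helper name <code>write_defined</code> |
| and the fill character are made up for this sketch. |

```c
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Write n defined bytes to fd.
   Returns the result of write(), or -1 if the allocation fails. */
ssize_t write_defined(int fd, size_t n)
{
    char *arr = malloc(n);
    if (arr == NULL)
        return -1;
    memset(arr, 'x', n);                  /* give every byte a defined value */
    ssize_t written = write(fd, arr, n);  /* buffer is addressable and defined */
    free(arr);
    return written;
}
```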
| |
| |
| <h4>2.6.5 Warning messages you might see</h4> |
| |
| Most of these only appear if you run in verbose mode (enabled by |
| <code>-v</code>): |
| <ul> |
| <li> <code>More than 50 errors detected. Subsequent errors |
| will still be recorded, but in less detail than before.</code> |
| <br> |
| After 50 different errors have been shown, Valgrind becomes |
| more conservative about collecting them. It then requires only |
| the program counters in the top two stack frames to match when |
| deciding whether or not two errors are really the same one. |
| Prior to this point, the PCs in the top four frames are required |
| to match. This hack has the effect of slowing down the |
| appearance of new errors after the first 50. The 50 constant can |
| be changed by recompiling Valgrind. |
| <p> |
| <li> <code>More than 500 errors detected. I'm not reporting any more. |
| Final error counts may be inaccurate. Go fix your |
| program!</code> |
| <br> |
| After 500 different errors have been detected, Valgrind ignores |
| any more. It seems unlikely that collecting even more different |
| ones would be of practical help to anybody, and it avoids the |
| danger that Valgrind spends more and more of its time comparing |
| new errors against an ever-growing collection. As above, the 500 |
| number is a compile-time constant. |
| <p> |
| <li> <code>Warning: client exiting by calling exit(<number>). |
| Bye!</code> |
| <br> |
| Your program has called the <code>exit</code> system call, which |
| will immediately terminate the process. You'll get no exit-time |
| error summaries or leak checks. Note that this is not the same |
| as your program calling the ANSI C function <code>exit()</code> |
| -- that causes a normal, controlled shutdown of Valgrind. |
| <p> |
| <li> <code>Warning: client switching stacks?</code> |
| <br> |
| Valgrind spotted such a large change in the stack pointer, %esp, |
| that it guesses the client is switching to a different stack. |
| At this point it makes a kludgey guess where the base of the new |
| stack is, and sets memory permissions accordingly. You may get |
| many bogus error messages following this, if Valgrind guesses |
| wrong. At the moment "large change" is defined as a change of |
| more than 2000000 in the value of the %esp (stack pointer) |
| register. |
| <p> |
| <li> <code>Warning: client attempted to close Valgrind's logfile fd <number> |
| </code> |
| <br> |
| Valgrind doesn't allow the client |
| to close the logfile, because you'd never see any diagnostic |
| information after that point. If you see this message, |
| you may want to use the <code>--logfile-fd=<number></code> |
| option to specify a different logfile file-descriptor number. |
| <p> |
| <li> <code>Warning: noted but unhandled ioctl <number></code> |
| <br> |
| Valgrind observed a call to one of the vast family of |
| <code>ioctl</code> system calls, but did not modify its |
| memory status info (because I have not yet got round to it). |
| The call will still have gone through, but you may get spurious |
| errors after this as a result of the non-update of the memory info. |
| <p> |
| <li> <code>Warning: unblocking signal <number> due to |
| sigprocmask</code> |
| <br> |
| Really just a diagnostic from the signal simulation machinery. |
| This message will appear if your program handles a signal by |
| first <code>longjmp</code>ing out of the signal handler, |
| and then unblocking the signal with <code>sigprocmask</code> |
| -- a standard signal-handling idiom. |
| <p> |
| <li> <code>Warning: bad signal number <number> in __NR_sigaction.</code> |
| <br> |
| Probably indicates a bug in the signal simulation machinery. |
| <p> |
| <li> <code>Warning: set address range perms: large range <number></code> |
| <br> |
| Diagnostic message, mostly for my benefit, to do with memory |
| permissions. |
| </ul> |
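| |
| <p>Two of the heuristics described in this list, the error-matching rule |
| and the stack-switch guess, can be sketched as follows. The constants |
| come from the text above; the function and constant names are invented, |
| and the real code may well be organised differently (for instance, it |
| presumably also compares the kind of error, which this toy ignores). |

```c
#include <stdbool.h>

/* Values quoted in the text; the names are hypothetical. */
enum {
    DETAIL_LIMIT       = 50,      /* errors shown before matching is loosened */
    FRAMES_STRICT      = 4,       /* frames compared for the first 50 errors */
    FRAMES_LOOSE       = 2,       /* frames compared after that */
    STACK_SWITCH_DELTA = 2000000  /* "large change" in %esp */
};

/* Decide whether two error contexts count as the same error.
   Both arrays must hold at least FRAMES_STRICT program counters,
   innermost frame first. */
bool same_error(const unsigned long *pcs_a, const unsigned long *pcs_b,
                int errors_shown)
{
    int frames = (errors_shown < DETAIL_LIMIT) ? FRAMES_STRICT : FRAMES_LOOSE;
    for (int i = 0; i < frames; i++) {
        if (pcs_a[i] != pcs_b[i])
            return false;
    }
    return true;
}

/* Guess whether a change in the stack pointer is a switch of stacks. */
bool looks_like_stack_switch(unsigned long old_esp, unsigned long new_esp)
{
    unsigned long delta =
        (old_esp > new_esp) ? old_esp - new_esp : new_esp - old_esp;
    return delta > STACK_SWITCH_DELTA;
}
```

| <p>With the looser rule, two contexts that differ only in their third |
| and fourth frames are merged into one error, which is why new errors |
| appear more slowly after the first 50. |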
| |
| |
| <a name="suppfiles"></a> |
| <h3>2.7 Writing suppressions files</h3> |
| |
| A suppressions file describes a bunch of errors which, for one reason |
| or another, you don't want Valgrind to tell you about. Usually the |
| reason is that the system libraries are buggy but unfixable, at least |
| within the scope of the current debugging session. Multiple |
| suppressions files are allowed. By default, Valgrind uses |
| <code>linux24.supp</code> in the directory where it is installed. |
| |
| <p> |
| You can ask to add suppressions from another file, by specifying |
| <code>--suppressions=/path/to/file.supp</code>. |
| |
| <p>Each suppression has the following components:<br> |
| <ul> |
| |
| <li>Its name. This merely gives a handy name to the suppression, by |
| which it is referred to in the summary of used suppressions |
| printed out when a program finishes. It's not important what |
| the name is; any identifying string will do. |
| <p> |
| |
| <li>The nature of the error to suppress. Either: |
| <code>Value1</code>, |
| <code>Value2</code>, |
| <code>Value4</code> or |
| <code>Value8</code>, |
| meaning an uninitialised-value error when |
| using a value of 1, 2, 4 or 8 bytes. |
| Or |
| <code>Cond</code> (or its old name, <code>Value0</code>), |
| meaning use of an uninitialised CPU condition code. Or: |
| <code>Addr1</code>, |
| <code>Addr2</code>, |
| <code>Addr4</code> or |
| <code>Addr8</code>, meaning an invalid address during a |
| memory access of 1, 2, 4 or 8 bytes respectively. Or |
| <code>Param</code>, |
| meaning an invalid system call parameter error. Or |
| <code>Free</code>, meaning an invalid or mismatching free.</li><br> |
| <p> |
| |
| <li>The "immediate location" specification. For Value and Addr |
| errors, this is either the name of the function in which the error |
| occurred, or, failing that, the full path to the .so file |
| containing the error location. For Param errors, it is the name of |
| the offending system call parameter. For Free errors, it is the |
| name of the function doing the freeing (e.g., <code>free</code>, |
| <code>__builtin_vec_delete</code>, etc.)</li><br> |
| <p> |
| |
| <li>The caller of the above "immediate location". Again, either a |
| function or shared-object name.</li><br> |
| <p> |
| |
| <li>Optionally, one or two extra calling-function or object names, |
| for greater precision.</li> |
| </ul> |
| |
| <p> |
| Locations may be either names of shared objects or wildcards matching |
| function names. They begin <code>obj:</code> and <code>fun:</code> |
| respectively. Function and object names to match against may use the |
| wildcard characters <code>*</code> and <code>?</code>. |
| |
| <p>A suppression only suppresses an error when the error matches all the |
| details in the suppression. Here's an example: |
| <pre> |
| { |
| __gconv_transform_ascii_internal/__mbrtowc/mbtowc |
| Value4 |
| fun:__gconv_transform_ascii_internal |
| fun:__mbrto*c |
| fun:mbtowc |
| } |
| </pre> |
| |
| <p>What it means is: suppress a use-of-uninitialised-value error, when |
| the data size is 4, when it occurs in the function |
| <code>__gconv_transform_ascii_internal</code>, when that is called |
| from any function of name matching <code>__mbrto*c</code>, |
| when that is called from |
| <code>mbtowc</code>. It doesn't apply under any other circumstances. |
| The string by which this suppression is identified to the user is |
| __gconv_transform_ascii_internal/__mbrtowc/mbtowc. |
| |
| <p>Another example: |
| <pre> |
| { |
| libX11.so.6.2/libX11.so.6.2/libXaw.so.7.0 |
| Value4 |
| obj:/usr/X11R6/lib/libX11.so.6.2 |
| obj:/usr/X11R6/lib/libX11.so.6.2 |
| obj:/usr/X11R6/lib/libXaw.so.7.0 |
| } |
| </pre> |
| |
| <p>Suppress any size 4 uninitialised-value error which occurs anywhere |
| in <code>libX11.so.6.2</code>, when called from anywhere in the same |
| library, when called from anywhere in <code>libXaw.so.7.0</code>. The |
| inexact specification of locations is regrettable, but is about all |
| you can hope for, given that the X11 libraries shipped with Red Hat |
| 7.2 have had their symbol tables removed. |
| |
| <p>Note -- since the above two examples did not make it clear -- that |
| you can freely mix the <code>obj:</code> and <code>fun:</code> |
| styles of description within a single suppression record. |
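| |
| <p>To make the wildcard rules concrete, here is a minimal matcher for |
| patterns using <code>*</code> and <code>?</code>. It assumes the usual |
| shell-style meaning (<code>*</code> matches any run of characters, |
| including none, and <code>?</code> matches exactly one character). The |
| function name <code>wildcard_match</code> is invented, and Valgrind's |
| own matcher may differ in detail. |

```c
#include <stdbool.h>

/* Returns true if str matches pat, where '*' matches any run of
   characters (possibly empty) and '?' matches exactly one character.
   Hypothetical helper written for this manual, not taken from Valgrind. */
bool wildcard_match(const char *pat, const char *str)
{
    if (*pat == '\0')
        return *str == '\0';

    if (*pat == '*') {
        /* Let '*' absorb zero characters, then one more, and so on. */
        for (;; str++) {
            if (wildcard_match(pat + 1, str))
                return true;
            if (*str == '\0')
                return false;
        }
    }

    if (*str != '\0' && (*pat == '?' || *pat == *str))
        return wildcard_match(pat + 1, str + 1);

    return false;
}
```

| <p>Under these assumptions the pattern <code>lib*.so.?</code> matches |
| <code>libX11.so.6</code> but not <code>libX11.so.62</code>, since |
| <code>?</code> stands for a single character. |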
| |
| |
| <a name="install"></a> |
| <h3>2.8 Building and installing</h3> |
| At the moment, very rudimentary. |
| |
| <p>The tarball is set up for a standard Red Hat 7.1 (6.2) machine. To |
| build, just do "make". No configure script, no autoconf, no nothing. |
| |
| <p>The files needed for installation are: valgrind.so, valgring.so, |
| valgrind, VERSION, redhat72.supp (or redhat62.supp). You can copy |
| these to any directory you like. However, you then need to edit the |
| shell script "valgrind". On line 4, set the environment variable |
| <code>VALGRIND</code> to point to the directory you have copied the |
| installation into. |
| |
| |
| <a name="problems"></a> |
| <h3>2.9 If you have problems</h3> |
| Mail me (<a href="mailto:jseward@acm.org">jseward@acm.org</a>). |
| |
| <p>See <a href="#limits">Section 4</a> for the known limitations of |
| Valgrind, and for a list of programs which are known not to work on |
| it. |
| |
| <p>The translator/instrumentor has a lot of assertions in it. They |
| are permanently enabled, and I have no plans to disable them. If one |
| of these breaks, please mail me! |
| |
| <p>If you get an assertion failure on the expression |
| <code>chunkSane(ch)</code> in <code>vg_free()</code> in |
| <code>vg_malloc.c</code>, this may have happened because your program |
| wrote off the end of a malloc'd block, or before its beginning. |
| Valgrind should have emitted a proper message to that effect before |
| dying in this way. This is a known problem which I should fix. |
| <p> |
| |
| <hr width="100%"> |
| |
| <a name="machine"></a> |
| <h2>3 Details of the checking machinery</h2> |
| |
| Read this section if you want to know, in detail, exactly what and how |
| Valgrind is checking. |
| |
| <a name="vvalue"></a> |
| <h3>3.1 Valid-value (V) bits</h3> |
| |
| It is simplest to think of Valgrind implementing a synthetic Intel x86 |
| CPU which is identical to a real CPU, except for one crucial detail. |
| Every bit (literally) of data processed, stored and handled by the |
| real CPU has, in the synthetic CPU, an associated "valid-value" bit, |
| which says whether or not the accompanying bit has a legitimate value. |
| In the discussions which follow, this bit is referred to as the V |
| (valid-value) bit. |
| |
| <p>Each byte in the system therefore has 8 V bits which accompany |
| it wherever it goes. For example, when the CPU loads a word-size item |
| (4 bytes) from memory, it also loads the corresponding 32 V bits from |
| a bitmap which stores the V bits for the process's entire address |
| space. If the CPU should later write the whole or some part of that |
| value to memory at a different address, the relevant V bits will be |
| stored back in the V-bit bitmap. |
| |
| <p>In short, each bit in the system has an associated V bit, which |
| follows it around everywhere, even inside the CPU. Yes, the CPU's |
| (integer) registers have their own V bit vectors. |
| |
| <p>Copying values around does not cause Valgrind to check for, or |
| report on, errors. However, when a value is used in a way which might |
| conceivably affect the outcome of your program's computation, the |
| associated V bits are immediately checked. If any of these indicate |
| that the value is undefined, an error is reported. |
| |
| <p>Here's an (admittedly nonsensical) example: |
| <pre> |
| int i, j; |
| int a[10], b[10]; |
| for (i = 0; i < 10; i++) { |
| j = a[i]; |
| b[i] = j; |
| } |
| </pre> |
| |
| <p>Valgrind emits no complaints about this, since it merely copies |
| uninitialised values from <code>a[]</code> into <code>b[]</code>, and |
| doesn't use them in any way. However, if the loop is changed to |
| <pre> |
| for (i = 0; i < 10; i++) { |
| j += a[i]; |
| } |
| if (j == 77) |
| printf("hello there\n"); |
| </pre> |
| then Valgrind will complain, at the <code>if</code>, that the |
| condition depends on uninitialised values. |
| |
| <p>Most low level operations, such as adds, cause Valgrind to |
| use the V bits for the operands to calculate the V bits for the |
| result. Even if the result is partially or wholly undefined, |
| it does not complain. |
| |
| <p>Checks on definedness only occur in two places: when a value is |
| used to generate a memory address, and when a control flow decision |
| needs to be made. Also, when a system call is detected, Valgrind |
| checks definedness of parameters as required. |
| |
| <p>If a check should detect undefinedness, an error message is |
| issued. The resulting value is subsequently regarded as well-defined. |
| To do otherwise would give long chains of error messages. In effect, |
| we say that undefined values are non-infectious. |
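| |
| <p>The copy / compute / check distinction can be modelled with a toy |
| "shadowed" value. Everything here is an assumption made for |
| illustration: the type and function names are invented, a set V bit is |
| taken to mean "defined", and the propagation rule for addition is |
| deliberately crude (the result is treated as wholly undefined if either |
| operand has any undefined bit). |

```c
#include <stdbool.h>
#include <stdint.h>

/* A 32-bit value plus one V bit per data bit.
   Assumed encoding for this sketch: V bit set = defined. */
typedef struct { uint32_t value; uint32_t vbits; } shadowed;

static int shadow_errors = 0;

/* Copying never checks anything; the V bits simply travel along. */
shadowed shadow_copy(shadowed src) { return src; }

/* Arithmetic computes V bits for the result but does not complain. */
shadowed shadow_add(shadowed a, shadowed b)
{
    shadowed r;
    r.value = a.value + b.value;
    r.vbits = (a.vbits == UINT32_MAX && b.vbits == UINT32_MAX)
                  ? UINT32_MAX : 0;
    return r;
}

/* A control-flow decision: check the V bits, report at most once,
   then regard the value as defined so the error does not cascade. */
bool shadow_branch_if_equal(shadowed *x, uint32_t k)
{
    if (x->vbits != UINT32_MAX) {
        shadow_errors++;
        x->vbits = UINT32_MAX;
    }
    return x->value == k;
}

int shadow_error_count(void) { return shadow_errors; }
```

| <p>In this model, copying and adding junk raise no error; the first |
| branch on the result raises exactly one, and a second branch on the |
| same value raises none. |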
| |
| <p>This sounds overcomplicated. Why not just check all reads from |
| memory, and complain if an undefined value is loaded into a CPU register? |
| Well, that doesn't work well, because perfectly legitimate C programs routinely |
| copy uninitialised values around in memory, and we don't want endless complaints |
| about that. Here's the canonical example. Consider a struct |
| like this: |
| <pre> |
| struct S { int x; char c; }; |
| struct S s1, s2; |
| s1.x = 42; |
| s1.c = 'z'; |
| s2 = s1; |
| </pre> |
| |
| <p>The question to ask is: how large is <code>struct S</code>, in |
| bytes? An int is 4 bytes and a char one byte, so perhaps a struct S |
| occupies 5 bytes? Wrong. All (non-toy) compilers I know of will |
| round the size of <code>struct S</code> up to a whole number of words, |
| in this case 8 bytes. Not doing this forces compilers to generate |
| truly appalling code for subscripting arrays of <code>struct |
| S</code>'s. |
| |
| <p>So s1 occupies 8 bytes, yet only 5 of them will be initialised. |
| For the assignment <code>s2 = s1</code>, gcc generates code to copy |
| all 8 bytes wholesale into <code>s2</code> without regard for their |
| meaning. If Valgrind simply checked values as they came out of |
| memory, it would yelp every time a structure assignment like this |
| happened. So the more complicated semantics described above is |
| necessary. This allows gcc to copy <code>s1</code> into |
| <code>s2</code> any way it likes, and a warning will only be emitted |
| if the uninitialised values are later used. |
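| |
| <p>If you want to see the padding on your own compiler, a sketch along |
| these lines will report it. The helper names are invented; the figure |
| of 8 bytes quoted above is what you would typically expect with a |
| 4-byte <code>int</code>, but the exact numbers depend on the compiler |
| and platform. |

```c
#include <stddef.h>

struct S { int x; char c; };

/* Total size of the struct, including any padding. */
size_t struct_s_size(void)
{
    return sizeof(struct S);
}

/* Offset of the first byte after the last field. */
size_t struct_s_padding_offset(void)
{
    return offsetof(struct S, c) + sizeof(char);
}

/* Bytes that belong to neither field; these are never initialised by
   assigning to x and c, yet are copied by a structure assignment. */
size_t struct_s_padding(void)
{
    return sizeof(struct S) - (sizeof(int) + sizeof(char));
}
```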
| |
| <p>One final twist to this story. The above scheme allows garbage to |
| pass through the CPU's integer registers without complaint. It does |
| this by giving the integer registers V tags, passing these around in |
| the expected way. This is complicated and computationally expensive to |
| do, but is necessary. Valgrind is more simplistic about |
| floating-point loads and stores. In particular, V bits for data read |
| as a result of floating-point loads are checked at the load |
| instruction. So if your program uses the floating-point registers to |
| do memory-to-memory copies, you will get complaints about |
| uninitialised values. Fortunately, I have not yet encountered a |
| program which (ab)uses the floating-point registers in this way. |
| |
| <a name="vaddress"></a> |
| <h3>3.2 Valid-address (A) bits</h3> |
| |
| Notice that the previous section describes how the validity of values |
| is established and maintained without having to say whether the |
| program does or does not have the right to access any particular |
| memory location. We now consider the latter issue. |
| |
| <p>As described above, every bit in memory or in the CPU has an |
| associated valid-value (V) bit. In addition, all bytes in memory, but |
| not in the CPU, have an associated valid-address (A) bit. This |
| indicates whether or not the program can legitimately read or write |
| that location. It does not give any indication of the validity of the |
| data at that location -- that's the job of the V bits -- only whether |
| or not the location may be accessed. |
| |
| <p>Every time your program reads or writes memory, Valgrind checks the |
| A bits associated with the address. If any of them indicate an |
| invalid address, an error is emitted. Note that the reads and writes |
| themselves do not change the A bits, only consult them. |
| |
| <p>So how do the A bits get set/cleared? Like this: |
| |
| <ul> |
| <li>When the program starts, all the global data areas are marked as |
| accessible.</li><br> |
| <p> |
| |
| <li>When the program does malloc/new, the A bits for exactly the |
| area allocated, and not a byte more, are marked as accessible. |
| Upon freeing the area the A bits are changed to indicate |
| inaccessibility.</li><br> |
| <p> |
| |
| <li>When the stack pointer register (%esp) moves up or down, A bits |
| are set. The rule is that the area from %esp up to the base of |
| the stack is marked as accessible, and below %esp is |
| inaccessible. (If that sounds illogical, bear in mind that the |
| stack grows down, not up, on almost all Unix systems, including |
| GNU/Linux.) Tracking %esp like this has the useful side-effect |
| that the section of stack used by a function for local variables |
| etc is automatically marked accessible on function entry and |
| inaccessible on exit.</li><br> |
| <p> |
| |
| <li>When doing system calls, A bits are changed appropriately. For |
| example, mmap() magically makes files appear in the process's |
| address space, so the A bits must be updated if mmap() |
| succeeds.</li><br> |
| </ul> |
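| |
| <p>The stack-pointer rule from the list above is easy to model. The |
| sketch below uses a tiny 256-byte "stack" indexed by offset, with one |
| A bit per byte; the array size and the names <code>set_esp</code> and |
| <code>access_ok</code> are assumptions of the example, not part of |
| Valgrind. |

```c
#include <stdbool.h>
#include <stddef.h>

enum { STACK_BYTES = 256 };          /* toy stack: offsets 0..255, base at 256 */
static bool stack_a_bit[STACK_BYTES];  /* true = accessible */

/* Apply the rule: everything from esp up to the base of the stack is
   accessible, everything below esp is inaccessible. */
void set_esp(size_t esp)
{
    for (size_t i = 0; i < STACK_BYTES; i++)
        stack_a_bit[i] = (i >= esp);
}

/* An access is legal only if every byte it touches has its A bit set.
   Reads and writes consult the A bits but never change them. */
bool access_ok(size_t addr, size_t len)
{
    if (len > STACK_BYTES || addr > STACK_BYTES - len)
        return false;
    for (size_t i = 0; i < len; i++) {
        if (!stack_a_bit[addr + i])
            return false;
    }
    return true;
}
```

| <p>Moving the toy stack pointer down by 8 on "function entry" makes |
| the new frame accessible, and moving it back up on "exit" makes the |
| same bytes inaccessible again. |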
| |
| |
| <a name="together"></a> |
| <h3>3.3 Putting it all together</h3> |
| Valgrind's checking machinery can be summarised as follows: |
| |
| <ul> |
| <li>Each byte in memory has 8 associated V (valid-value) bits, |
| saying whether or not the byte has a defined value, and a single |
| A (valid-address) bit, saying whether or not the program |
| currently has the right to read/write that address.</li><br> |
| <p> |
| |
| <li>When memory is read or written, the relevant A bits are |
| consulted. If they indicate an invalid address, Valgrind emits |
| an Invalid read or Invalid write error.</li><br> |
| <p> |
| |
| <li>When memory is read into the CPU's integer registers, the |
| relevant V bits are fetched from memory and stored in the |
| simulated CPU. They are not consulted.</li><br> |
| <p> |
| |
| <li>When an integer register is written out to memory, the V bits |
| for that register are written back to memory too.</li><br> |
| <p> |
| |
| <li>When memory is read into the CPU's floating point registers, the |
| relevant V bits are read from memory and they are immediately |
| checked. If any are invalid, an uninitialised value error is |
| emitted. This precludes using the floating-point registers to |
| copy possibly-uninitialised memory, but simplifies Valgrind in |
| that it does not have to track the validity status of the |
| floating-point registers.</li><br> |
| <p> |
| |
| <li>As a result, when a floating-point register is written to |
| memory, the associated V bits are set to indicate a valid |
| value.</li><br> |
| <p> |
| |
| <li>When values in integer CPU registers are used to generate a |
| memory address, or to determine the outcome of a conditional |
| branch, the V bits for those values are checked, and an error |
| emitted if any of them are undefined.</li><br> |
| <p> |
| |
| <li>When values in integer CPU registers are used for any other |
| purpose, Valgrind computes the V bits for the result, but does |
| not check them.</li><br> |
| <p> |
| |
| <li>Once the V bits for a value in the CPU have been checked, they |
| are then set to indicate validity. This avoids long chains of |
| errors.</li><br> |
| <p> |
| |
| <li>When values are loaded from memory, Valgrind checks the A bits |
| for that location and issues an illegal-address warning if |
| needed. In that case, the V bits loaded are forced to indicate |
| Valid, despite the location being invalid. |
| <p> |
| This apparently strange choice reduces the amount of confusing |
| information presented to the user. It avoids the |
| unpleasant phenomenon in which memory is read from a place which |
| is both unaddressable and contains invalid values, and, as a |
| result, you get not only an invalid-address (read/write) error, |
| but also a potentially large set of uninitialised-value errors, |
| one for every time the value is used. |
| <p> |
| There is a hazy boundary case to do with multi-byte loads from |
| addresses which are partially valid and partially invalid. See |
| the description of the flag <code>--partial-loads-ok</code> for details. |
| </li><br> |
| </ul> |
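| |
| <p>The interaction between A and V bits on an integer load can be |
| sketched for a single byte. This is a toy shadow memory of 16 bytes; |
| the names, and the encoding in which <code>0xFF</code> means "all 8 |
| bits defined", are assumptions made for the example. |

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum { TOY_MEM_BYTES = 16 };
static uint8_t toy_a[TOY_MEM_BYTES];  /* 1 = addressable */
static uint8_t toy_v[TOY_MEM_BYTES];  /* 0xFF = all bits defined (assumed) */
static int toy_addr_errors = 0;

/* Set the shadow state for one byte of toy memory. */
void toy_set_perms(size_t addr, bool addressable, uint8_t vbits)
{
    if (addr < TOY_MEM_BYTES) {
        toy_a[addr] = addressable ? 1 : 0;
        toy_v[addr] = vbits;
    }
}

/* Integer load of one byte: returns the V bits that accompany the data
   into the register. */
uint8_t toy_load_vbits(size_t addr)
{
    if (addr >= TOY_MEM_BYTES || !toy_a[addr]) {
        toy_addr_errors++;   /* invalid read is reported ...            */
        return 0xFF;         /* ... and the value is forced to "valid"  */
    }
    return toy_v[addr];      /* V bits are fetched, not checked */
}

int toy_addr_error_count(void) { return toy_addr_errors; }
```

| <p>Loading an addressable but undefined byte produces no error and |
| undefined V bits; loading an unaddressable byte produces one address |
| error and V bits marked valid, so no follow-on uninitialised-value |
| errors arise from it. |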
| |
| Valgrind intercepts calls to malloc, calloc, realloc, valloc, |
| memalign, free, new and delete. The behaviour you get is: |
| |
| <ul> |
| |
| <li>malloc/new: the returned memory is marked as addressable but not |
| having valid values. This means you have to write on it before |
| you can read it.</li><br> |
| <p> |
| |
| <li>calloc: returned memory is marked both addressable and valid, |
| since calloc() clears the area to zero.</li><br> |
| <p> |
| |
| <li>realloc: if the new size is larger than the old, the new section |
| is addressable but invalid, as with malloc. |
| If the new size is smaller, the dropped-off section is marked as |
| unaddressable. You may only pass to realloc a pointer |
| previously issued to you by malloc/calloc/new/realloc.</li><br> |
| <p> |
| |
| <li>free/delete: you may only pass to free a pointer previously |
| issued to you by malloc/calloc/new/realloc, or the value |
| NULL. Otherwise, Valgrind complains. If the pointer is indeed |
| valid, Valgrind marks the entire area it points at as |
| unaddressable, and places the block in the freed-blocks-queue. |
| The aim is to defer as long as possible reallocation of this |
| block. Until that happens, all attempts to access it will |
| elicit an invalid-address error, as you would hope.</li><br> |
| </ul> |
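| |
| <p>The rules in this list amount to a small state machine per byte. |
| The sketch below models them on a 64-byte toy arena addressed by |
| offset. The state names and the <code>on_*</code> functions are |
| invented, and the realloc case only covers resizing in place (a real |
| realloc may move the block). |

```c
#include <stddef.h>

/* Hypothetical per-byte states for this sketch. */
enum byte_state { NOACCESS, UNDEFINED, DEFINED };
enum { ARENA_BYTES = 64 };
static enum byte_state heap_state[ARENA_BYTES];  /* starts as NOACCESS */

static void mark(size_t start, size_t len, enum byte_state s)
{
    for (size_t i = start; i < start + len && i < ARENA_BYTES; i++)
        heap_state[i] = s;
}

/* malloc/new: addressable, but contents not yet valid. */
void on_malloc(size_t p, size_t n) { mark(p, n, UNDEFINED); }

/* calloc: addressable and valid, since the area is zeroed. */
void on_calloc(size_t p, size_t n) { mark(p, n, DEFINED); }

/* free/delete: the whole block becomes unaddressable. */
void on_free(size_t p, size_t n) { mark(p, n, NOACCESS); }

/* realloc in place: a grown tail is like fresh malloc'd memory,
   a dropped-off tail becomes unaddressable. */
void on_realloc(size_t p, size_t old_n, size_t new_n)
{
    if (new_n > old_n)
        mark(p + old_n, new_n - old_n, UNDEFINED);
    else
        mark(p + new_n, old_n - new_n, NOACCESS);
}

enum byte_state state_of(size_t i)
{
    return (i < ARENA_BYTES) ? heap_state[i] : NOACCESS;
}
```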
| |
| |
| |
| <a name="signals"></a> |
| <h3>3.4 Signals</h3> |
| |
| Valgrind provides suitable handling of signals, so, provided you stick |
| to POSIX stuff, you should be ok. Basic sigaction() and sigprocmask() |
| are handled. Signal handlers may return in the normal way or do |
| longjmp(); both should work ok. As specified by POSIX, a signal is |
| blocked in its own handler. Default actions for signals should work |
| as before. Etc, etc. |
| |
| <p>Under the hood, dealing with signals is a real pain, and Valgrind's |
| simulation leaves much to be desired. If your program does |
| way-strange stuff with signals, bad things may happen. If so, let me |
| know. I don't promise to fix it, but I'd at least like to be aware of |
| it. |
| |
| |
| <a name="leaks"></a> |
| <h3>3.5 Memory leak detection</h3> |
| |
| Valgrind keeps track of all memory blocks issued in response to calls |
| to malloc/calloc/realloc/new. So when the program exits, it knows |
| which blocks are still outstanding -- have not been returned, in other |
| words. Ideally, you want your program to have no blocks still in use |
| at exit. But many programs do. |
| |
| <p>For each such block, Valgrind scans the entire address space of the |
| process, looking for pointers to the block. One of three situations |
| may result: |
| |
| <ul> |
| <li>A pointer to the start of the block is found. This usually |
| indicates programming sloppiness; since the block is still |
| pointed at, the programmer could, at least in principle, have freed |
| it before program exit.</li><br> |
| <p> |
| |
| <li>A pointer to the interior of the block is found. The pointer |
| might originally have pointed to the start and have been moved |
| along, or it might be entirely unrelated. Valgrind deems such a |
| block as "dubious", that is, possibly leaked, |
| because it's unclear whether or |
| not a pointer to it still exists.</li><br> |
| <p> |
| |
| <li>The worst outcome is that no pointer to the block can be found. |
| The block is classified as "leaked", because the |
| programmer could not possibly have free'd it at program exit, |
| since no pointer to it exists. This might be a symptom of |
| having lost the pointer at some earlier point in the |
| program.</li> |
| </ul> |
| |
| Valgrind reports summaries about leaked and dubious blocks. |
| For each such block, it will also tell you where the block was |
| allocated. This should help you figure out why the pointer to it has |
| been lost. In general, you should attempt to ensure your programs do |
| not have any leaked or dubious blocks at exit. |
| |
| <p>The precise area of memory in which Valgrind searches for pointers |
| is: all naturally-aligned 4-byte words for which all A bits indicate |
| addressability and all V bits indicate that the stored value is |
| actually valid. |
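| |
| <p>The three-way classification can be sketched as a scan over an |
| array of candidate pointer words. This is a simplification: the words |
| are passed in as a plain array of <code>uintptr_t</code> rather than |
| being found by walking the address space and consulting the A and V |
| bits, and the names are invented for the example. |

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names for the three outcomes described above. */
enum leak_kind { STILL_POINTED_AT, DUBIOUS, LEAKED };

/* Classify one outstanding block, given the words found in memory.
   start/size describe the block; words holds candidate pointers. */
enum leak_kind classify_block(const uintptr_t *words, size_t n_words,
                              uintptr_t start, size_t size)
{
    enum leak_kind result = LEAKED;       /* assume the worst */
    for (size_t i = 0; i < n_words; i++) {
        if (words[i] == start)
            return STILL_POINTED_AT;      /* pointer to the start wins */
        if (words[i] > start && words[i] - start < size)
            result = DUBIOUS;             /* pointer into the interior */
    }
    return result;
}
```

| <p>A pointer to the start of the block takes priority over any |
| interior pointers, so a block is only called dubious when interior |
| pointers are all that can be found. |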
| |
| <p><hr width="100%"> |
| |
| |
| <a name="limits"></a> |
| <h2>4 Limitations</h2> |
| |
| The following list of limitations seems depressingly long. However, |
| most programs actually work fine. |
| |
| <p>Valgrind will run x86-GNU/Linux ELF dynamically linked binaries, on |
| a kernel 2.4.X system, subject to the following constraints: |
| |
| <ul> |
| <li>No MMX, SSE, SSE2, 3DNow instructions. If the translator |
| encounters these, Valgrind will simply give up. It may be |
| possible to add support for them at a later time. Intel added a |
| few instructions such as "cmov" to the integer instruction set |
| on Pentium and later processors, and these are supported. |
| Nevertheless it's safest to think of Valgrind as implementing |
| the 486 instruction set.</li><br> |
| <p> |
| |
| <li>Multithreaded programs are not supported, since I haven't yet |
| figured out how to do this. To be more specific, it is the |
| "clone" system call which is not supported. A program calls |
| "clone" to create threads. Valgrind will abort if this |
| happens.</li><br> |
| <p> |
| |
| <li>Valgrind assumes that the floating point registers are not used |
| as intermediaries in memory-to-memory copies, so it immediately |
| checks V bits in floating-point loads/stores. If you want to |
| write code which copies around possibly-uninitialised values, |
| you must ensure these travel through the integer registers, not |
| the FPU.</li><br> |
| <p> |
| |
| <li>If your program does its own memory management, rather than |
| using malloc/new/free/delete, it should still work, but |
| Valgrind's error checking won't be so effective.</li><br> |
| <p> |
| |
| <li>Valgrind's signal simulation is not as robust as it could be. |
| Basic POSIX-compliant sigaction and sigprocmask functionality is |
| supplied, but it's conceivable that things could go badly awry |
| if you do weird things with signals. Workaround: don't. |
| Programs that do non-POSIX signal tricks are in any case |
| inherently unportable, so should be avoided if |
| possible.</li><br> |
| <p> |
| |
| <li>I have no idea what happens if programs try to handle signals on |
| an alternate stack (sigaltstack). YMMV.</li><br> |
| <p> |
| |
| <li>Programs which switch stacks are not well handled. Valgrind |
| does have support for this, but I don't have great faith in it. |
| It's difficult -- there's no cast-iron way to decide whether a |
| large change in %esp is as a result of the program switching |
| stacks, or merely allocating a large object temporarily on the |
| current stack -- yet Valgrind needs to handle the two situations |
| differently.</li><br> |
| <p> |
| |
| <li>x86 instructions, and system calls, have been implemented on |
| demand. So it's possible, although unlikely, that a program |
| will fall over with a message to that effect. If this happens, |
| please mail me ALL the details printed out, so I can try and |
| implement the missing feature.</li><br> |
| <p> |
| |
| <li>x86 floating point works correctly, but floating-point code may |
| run even more slowly than integer code, due to my simplistic |
| approach to FPU emulation.</li><br> |
| <p> |
| |
| <li>You can't Valgrind-ize statically linked binaries. Valgrind |
| relies on the dynamic-link mechanism to gain control at |
| startup.</li><br> |
| <p> |
| |
| <li>Memory consumption of your program is greatly increased whilst |
| running under Valgrind. This is due to the large amount of |
| administrative information maintained behind the scenes. Another |
| cause is that Valgrind dynamically translates the original |
| executable and never throws any translation away, except in |
| those rare cases where self-modifying code is detected. |
| Translated, instrumented code is 8-12 times larger than the |
| original (!) so you can easily end up with 15+ MB of |
| translations when running (eg) a web browser. There's not a lot |
| you can do about this -- use Valgrind on a fast machine with a lot |
| of memory and swap space. At some point I may implement an LRU |
| caching scheme for translations, so as to bound the maximum |
| amount of memory devoted to them, to say 8 or 16 MB.</li> |
| </ul> |
| |
| |
| Programs which are known not to work, or to work only partially, are: |
| |
| <ul> |
| <li>Netscape 4.76 works pretty well on some platforms -- quite |
| nicely on my AMD K6-III (400 MHz). I can surf, do mail, etc, no |
| problem. On other platforms it has been observed to crash |
| during startup. Despite much investigation I can't figure out |
| why.</li><br> |
| <p> |
| |
| <li>kpackage (a KDE front end to rpm) dies because the CPUID |
| instruction is unimplemented. Easy to fix.</li><br> |
| <p> |
| |
| <li>knode (a KDE newsreader) tries to do multithreaded things, and |
| fails.</li><br> |
| <p> |
| |
| <li>emacs starts up but immediately concludes it is out of memory |
| and aborts. Emacs has its own memory-management scheme, but I |
| don't understand why this should interact so badly with |
| Valgrind.</li><br> |
| <p> |
| |
| <li>Gimp and Gnome and GTK-based apps die early on because |
| of unimplemented system call wrappers. (I'm a KDE user :) |
| This wouldn't be hard to fix. |
| </li><br> |
| <p> |
| |
| <li>As a consequence of me being a KDE user, almost all KDE apps |
| work ok -- except those which are multithreaded. |
| </li><br> |
| <p> |
| </ul> |
| |
| |
| <p><hr width="100%"> |
| |
| |
| <a name="howitworks"></a> |
| <h2>5 How it works -- a rough overview</h2> |
| Some gory details, for those with a passion for gory details. You |
| don't need to read this section if all you want to do is use Valgrind. |
| |
| <a name="startb"></a> |
| <h3>5.1 Getting started</h3> |
| |
| Valgrind is compiled into a shared object, valgrind.so. The shell |
| script valgrind sets the LD_PRELOAD environment variable to point to |
| valgrind.so. This causes the .so to be loaded as an extra library to |
| any subsequently executed dynamically-linked ELF binary, viz, the |
| program you want to debug. |
| |
| <p>The dynamic linker allows each .so in the process image to have an |
| initialisation function which is run before main(). It also allows |
| each .so to have a finalisation function run after main() exits. |
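| |
| <p>For readers who have not met this mechanism, here is a minimal |
| sketch of a shared object with such a pair of functions. It uses |
| GCC's <code>constructor</code> and <code>destructor</code> function |
| attributes, which is one way of asking for them; I am not claiming it |
| is how valgrind.so itself is written. The names |
| <code>tool_init</code>, <code>tool_fini</code> and |
| <code>tool_started</code> are invented for the example. |

```c
#include <stdio.h>

/* Hypothetical flag, present only so the effect can be observed. */
int tool_started = 0;

/* Run before main(), once the object has been loaded. */
__attribute__((constructor))
static void tool_init(void)
{
    tool_started = 1;
    fputs("tool: initialisation function, before main()\n", stderr);
}

/* Run after main() returns or exit() is called. */
__attribute__((destructor))
static void tool_fini(void)
{
    fputs("tool: finalisation function, after main()\n", stderr);
}
```

| <p>Built with something like <code>gcc -shared -fPIC -o preload_demo.so |
| preload_demo.c</code> (the file names are placeholders) and run as |
| <code>LD_PRELOAD=./preload_demo.so ./yourprogram</code>, the two |
| messages should bracket the program's own output. |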
| |
| <p>When valgrind.so's initialisation function is called by the dynamic |
| linker, the synthetic CPU starts up. The real CPU remains locked |
| in valgrind.so for the entire rest of the program, but the synthetic |
| CPU returns from the initialisation function. Startup of the program |
| now continues as usual -- the dynamic linker calls all the other .so's |
| initialisation routines, and eventually runs main(). This all runs on |
| the synthetic CPU, not the real one, but the client program cannot |
| tell the difference. |
| |
| <p>Eventually main() exits, so the synthetic CPU calls valgrind.so's |
| finalisation function. Valgrind detects this, and uses it as its cue |
| to exit. It prints summaries of all errors detected, possibly checks |
| for memory leaks, and then exits the finalisation routine, but now on |
| the real CPU. The synthetic CPU has now lost control -- permanently |
| -- so the program exits back to the OS on the real CPU, just as it |
| would have done anyway. |
| |
| <p>On entry, Valgrind switches stacks, so it runs on its own stack. |
| On exit, it switches back. This means that the client program |
| continues to run on its own stack, so we can switch back and forth |
| between running it on the simulated and real CPUs without difficulty. |
| This was an important design decision, because it makes it easy (well, |
| significantly less difficult) to debug the synthetic CPU. |
| |
| |
| <a name="engine"></a> |
| <h3>5.2 The translation/instrumentation engine</h3> |
| |
| Valgrind does not directly run any of the original program's code. Only |
| instrumented translations are run. Valgrind maintains a translation |
| table, which allows it to find the translation quickly for any branch |
| target (code address). If no translation has yet been made, the |
| translator - a just-in-time translator - is summoned. This makes an |
| instrumented translation, which is added to the collection of |
| translations. Subsequent jumps to that address will use this |
| translation. |
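| |
| <p>The find-or-translate step can be sketched in C roughly as follows. |
| This is a toy under stated assumptions, not the real translation table: |
| the fixed-size open-addressed hash table, the names <code>TTEntry</code> |
| and <code>find_or_translate</code>, and the dummy |
| <code>make_translation</code> (which only hands out distinct |
| placeholder addresses) are all invented for the example. |

```c
#include <stddef.h>
#include <stdint.h>

#define TT_SIZE 1024u   /* hypothetical capacity; must be a power of two */

typedef struct {
    uint32_t    orig_addr;  /* address of the original code */
    const void *trans;      /* its translation; NULL means the slot is empty */
} TTEntry;

static TTEntry tt[TT_SIZE];
unsigned translations_made; /* counts calls to the translator */

/* Stand-in for the translator: returns a distinct dummy address. */
static const void *make_translation(uint32_t orig_addr)
{
    static unsigned char fake_code[TT_SIZE];
    (void)orig_addr;
    return &fake_code[translations_made++];
}

/* Return the translation for orig_addr, making one on a miss. */
const void *find_or_translate(uint32_t orig_addr)
{
    unsigned first = orig_addr & (TT_SIZE - 1);
    unsigned probes;

    for (probes = 0; probes < TT_SIZE; probes++) {
        TTEntry *e = &tt[(first + probes) & (TT_SIZE - 1)];
        if (e->trans != NULL && e->orig_addr == orig_addr)
            return e->trans;                    /* hit: reuse it */
        if (e->trans == NULL) {                 /* miss: translate, remember */
            e->orig_addr = orig_addr;
            e->trans = make_translation(orig_addr);
            return e->trans;
        }
    }
    return NULL;    /* table full; a real table would evict or grow */
}
```

| <p>Calling <code>find_or_translate</code> twice with the same address |
| invokes the translator only once; the second call is a plain table |
| lookup. |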
| |
| <p>Valgrind can optionally check writes made by the application, to |
| see if they are writing an address contained within code which has |
| been translated. Such a write invalidates translations of code |
| bracketing the written address. Valgrind will discard the relevant |
| translations, which causes them to be re-made, if they are needed |
| again, reflecting the new updated data stored there. In this way, |
| self-modifying code is supported. In practice I have not found any |
| Linux applications which use self-modifying code. |
| |
| <p>The JITter translates basic blocks -- blocks of straight-line-code |
| -- as single entities. To minimise the considerable difficulties of |
| dealing with the x86 instruction set, x86 instructions are first |
| translated to a RISC-like intermediate code, similar to sparc code, |
| but with an infinite number of virtual integer registers. Initially |
| each insn is translated separately, and there is no attempt at |
| instrumentation. |
| |
| <p>The intermediate code is improved, mostly so as to try and cache |
| the simulated machine's registers in the real machine's registers over |
| several simulated instructions. This is often very effective. Also, |
| we try to remove redundant updates of the simulated machine's |
| condition-code register. |
| |
| <p>The intermediate code is then instrumented, giving more |
| intermediate code. There are a few extra intermediate-code operations |
| to support instrumentation; it is all refreshingly simple. After |
| instrumentation there is a cleanup pass to remove redundant value |
| checks. |
| |
| <p>This gives instrumented intermediate code which mentions arbitrary |
| numbers of virtual registers. A linear-scan register allocator is |
| used to assign real registers and possibly generate spill code. All |
| of this is still phrased in terms of the intermediate code. This |
| machinery is inspired by the work of Reuben Thomas (MITE). |
| |
| <p>Then, and only then, is the final x86 code emitted. The |
| intermediate code is carefully designed so that x86 code can be |
| generated from it without need for spare registers or other |
| inconveniences. |
| |
| <p>The translations are managed using a traditional LRU-based caching |
| scheme. The translation cache has a default size of about 14MB. |
| |
| <a name="track"></a> |
| |
| <h3>5.3 Tracking the status of memory</h3> Each byte in the |
| process' address space has nine bits associated with it: one A bit and |
| eight V bits. The A and V bits for each byte are stored using a |
| sparse array, which flexibly and efficiently covers arbitrary parts of |
| the 32-bit address space without imposing significant space or |
| performance overheads for the parts of the address space never |
| visited. The scheme used, and speedup hacks, are described in detail |
| at the top of the source file vg_memory.c, so you should read that for |
| the gory details. |
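| |
| <p>To give a flavour of what a sparse array of this kind can look like, |
| here is a minimal two-level sketch in C. It is an assumption-laden |
| illustration, not a copy of vg_memory.c: the 64 KB chunk size, the |
| names <code>Chunk</code>, <code>shadow_set</code> and |
| <code>shadow_get</code>, and the convention that a set bit means |
| "addressable" or "valid" are all choices made for the example. |

```c
#include <stdint.h>
#include <stdlib.h>

#define CHUNK_BYTES 65536u          /* hypothetical: one chunk shadows 64 KB */

typedef struct {
    uint8_t abits[CHUNK_BYTES / 8]; /* one A bit per byte; 1 = addressable */
    uint8_t vbits[CHUNK_BYTES];     /* eight V bits per byte; 1 = valid here */
} Chunk;

/* Indexed by the top 16 address bits; NULL = region never visited. */
static Chunk *primary[65536];

static Chunk *chunk_for(uint32_t addr, int create)
{
    Chunk **slot = &primary[addr >> 16];
    if (*slot == NULL && create)
        *slot = calloc(1, sizeof(Chunk));   /* starts out unaddressable */
    return *slot;
}

/* Mark one byte addressable and record its V bits.  Returns 0 on success. */
int shadow_set(uint32_t addr, uint8_t vbits)
{
    Chunk *c = chunk_for(addr, 1);
    uint32_t off = addr & 0xFFFFu;
    if (c == NULL)
        return -1;                          /* out of memory */
    c->abits[off >> 3] |= (uint8_t)(1u << (off & 7));
    c->vbits[off] = vbits;
    return 0;
}

/* Return the A bit; if it is set, also store the V bits in *vbits_out. */
int shadow_get(uint32_t addr, uint8_t *vbits_out)
{
    Chunk *c = chunk_for(addr, 0);
    uint32_t off = addr & 0xFFFFu;
    if (c == NULL || !(c->abits[off >> 3] & (1u << (off & 7))))
        return 0;
    *vbits_out = c->vbits[off];
    return 1;
}
```

| <p>Parts of the address space that are never touched cost only a null |
| pointer in the primary table, which is the property the paragraph above |
| is describing. |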
| |
| <a name="sys_calls"></a> |
| |
| <h3>5.4 System calls</h3> |
| All system calls are intercepted. The memory status map is consulted |
| before and updated after each call. It's all rather tiresome. See |
| vg_syscall_mem.c for details. |
| |
| <a name="sys_signals"></a> |
| |
| <h3>5.5 Signals</h3> |
| All system calls to sigaction() and sigprocmask() are intercepted. If |
| the client program is trying to set a signal handler, Valgrind makes a |
| note of the handler address and which signal it is for. Valgrind then |
| arranges for the same signal to be delivered to its own handler. |
| |
| <p>When such a signal arrives, Valgrind's own handler catches it, and |
| notes the fact. At a convenient safe point in execution, Valgrind |
| builds a signal delivery frame on the client's stack and runs its |
| handler. If the handler longjmp()s, there is nothing more to be said. |
| If the handler returns, Valgrind notices this, zaps the delivery |
| frame, and carries on where it left off before delivering the signal. |
| |
| <p>The purpose of this nonsense is that setting signal handlers |
| essentially amounts to giving callback addresses to the Linux kernel. |
| We can't allow this to happen, because if it did, signal handlers |
| would run on the real CPU, not the simulated one. This means the |
| checking machinery would not operate during the handler run, and, |
| worse, memory permissions maps would not be updated, which could cause |
| spurious error reports once the handler had returned. |
| |
| <p>An even worse thing would happen if the signal handler longjmp'd |
| rather than returned: Valgrind would completely lose control of the |
| client program. |
| |
| <p>Upshot: we can't allow the client to install signal handlers |
| directly. Instead, Valgrind must catch, on behalf of the client, any |
| signal the client asks to catch, and must deliver it to the client on |
| the simulated CPU, not the real one. This involves considerable |
| gruesome fakery; see vg_signals.c for details. |
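| |
| <p>Stripped of all the fakery, the catch-now, deliver-later scheme |
| looks something like the C sketch below. It is only an outline under |
| stated assumptions: <code>intercept_handler</code>, |
| <code>deliver_pending</code>, <code>count_signal</code> and the bound |
| <code>MAX_SIG</code> are invented names, and where the sketch simply |
| calls the client's handler, the real thing would build a delivery frame |
| and run the handler on the simulated CPU. |

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <signal.h>
#include <stddef.h>

#define MAX_SIG 64                              /* hypothetical bound */

static void (*client_handler[MAX_SIG])(int);    /* what the client asked for */
static volatile sig_atomic_t pending[MAX_SIG];  /* caught, not yet delivered */

/* Installed with the kernel: only notes that the signal arrived. */
static void tool_handler(int sig)
{
    if (sig > 0 && sig < MAX_SIG)
        pending[sig] = 1;
}

/* Used in place of the client's own sigaction() call. */
int intercept_handler(int sig, void (*handler)(int))
{
    struct sigaction sa;
    if (sig <= 0 || sig >= MAX_SIG)
        return -1;
    client_handler[sig] = handler;      /* note the client's handler */
    sa.sa_handler = tool_handler;       /* but register our own */
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(sig, &sa, NULL);
}

/* Called at a safe point; returns how many signals were delivered. */
int deliver_pending(void)
{
    int sig, delivered = 0;
    for (sig = 1; sig < MAX_SIG; sig++) {
        if (pending[sig] && client_handler[sig] != NULL) {
            pending[sig] = 0;
            client_handler[sig](sig);   /* stand-in for simulated delivery */
            delivered++;
        }
    }
    return delivered;
}

/* Example client handler for trying the sketch out. */
int client_hits;
void count_signal(int sig) { (void)sig; client_hits++; }
```

| <p>After <code>intercept_handler(SIGUSR1, count_signal)</code>, a |
| <code>raise(SIGUSR1)</code> leaves <code>client_hits</code> untouched; |
| the client's handler runs only when <code>deliver_pending()</code> is |
| next called. |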
| <p> |
| |
| <hr width="100%"> |
| |
| <a name="example"></a> |
| <h2>6 Example</h2> |
| This is the log for a run of a small program. The program is in fact |
| correct, and the reported error is the result of a potentially serious |
| code generation bug in GNU g++ (snapshot 20010527). |
| <pre> |
| sewardj@phoenix:~/newmat10$ |
| ~/Valgrind-6/valgrind -v ./bogon |
| ==25832== Valgrind 0.10, a memory error detector for x86 RedHat 7.1. |
| ==25832== Copyright (C) 2000-2001, and GNU GPL'd, by Julian Seward. |
| ==25832== Startup, with flags: |
| ==25832== --suppressions=/home/sewardj/Valgrind/redhat71.supp |
| ==25832== reading syms from /lib/ld-linux.so.2 |
| ==25832== reading syms from /lib/libc.so.6 |
| ==25832== reading syms from /mnt/pima/jrs/Inst/lib/libgcc_s.so.0 |
| ==25832== reading syms from /lib/libm.so.6 |
| ==25832== reading syms from /mnt/pima/jrs/Inst/lib/libstdc++.so.3 |
| ==25832== reading syms from /home/sewardj/Valgrind/valgrind.so |
| ==25832== reading syms from /proc/self/exe |
| ==25832== loaded 5950 symbols, 142333 line number locations |
| ==25832== |
| ==25832== Invalid read of size 4 |
| ==25832== at 0x8048724: _ZN10BandMatrix6ReSizeEiii (bogon.cpp:45) |
| ==25832== by 0x80487AF: main (bogon.cpp:66) |
| ==25832== by 0x40371E5E: __libc_start_main (libc-start.c:129) |
| ==25832== by 0x80485D1: (within /home/sewardj/newmat10/bogon) |
| ==25832== Address 0xBFFFF74C is not stack'd, malloc'd or free'd |
| ==25832== |
| ==25832== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0) |
| ==25832== malloc/free: in use at exit: 0 bytes in 0 blocks. |
| ==25832== malloc/free: 0 allocs, 0 frees, 0 bytes allocated. |
| ==25832== For a detailed leak analysis, rerun with: --leak-check=yes |
| ==25832== |
| ==25832== exiting, did 1881 basic blocks, 0 misses. |
| ==25832== 223 translations, 3626 bytes in, 56801 bytes out. |
| </pre> |
| <p>The GCC folks fixed this about a week before gcc-3.0 shipped. |
| <hr width="100%"> |
| <p> |
| </body> |
| </html> |