| <?xml version="1.0"?> <!-- -*- sgml -*- --> |
| <!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN" |
| "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd" |
| [ <!ENTITY % vg-entities SYSTEM "../../docs/xml/vg-entities.xml"> %vg-entities; ]> |
| |
| |
| <chapter id="ms-manual" xreflabel="Massif: a heap profiler"> |
| <title>Massif: a heap profiler</title> |
| |
| <para>To use this tool, you must specify |
| <option>--tool=massif</option> on the Valgrind |
| command line.</para> |
| |
| <sect1 id="ms-manual.overview" xreflabel="Overview"> |
| <title>Overview</title> |
| |
| <para>Massif is a heap profiler. It measures how much heap memory your |
| program uses. This includes both the useful space, and the extra bytes |
| allocated for book-keeping and alignment purposes. It can also |
| measure the size of your program's stack(s), although it does not do so by |
| default.</para> |
| |
| <para>Heap profiling can help you reduce the amount of memory your program |
| uses. On modern machines with virtual memory, this provides the following |
| benefits:</para> |
| |
| <itemizedlist> |
| <listitem><para>It can speed up your program -- a smaller |
| program will interact better with your machine's caches and |
| avoid paging.</para></listitem> |
| |
| <listitem><para>If your program uses lots of memory, it will |
| reduce the chance that it exhausts your machine's swap |
| space.</para></listitem> |
| </itemizedlist> |
| |
| <para>Also, there are certain space leaks that aren't detected by |
| traditional leak-checkers, such as Memcheck's. That's because |
| the memory isn't ever actually lost -- a pointer remains to it -- |
| but it's not in use. Programs that have leaks like this can |
| unnecessarily increase the amount of memory they are using over |
| time. Massif can help identify these leaks.</para> |
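| |
| <para>As a rough illustration of such a leak, consider the minimal sketch |
| below. It is not taken from any real program: the hypothetical |
| <function>log_event</function> function appends every message to a global |
| list that nothing ever reads or trims. Every block stays reachable, so a |
| leak checker has nothing to report, yet the heap keeps growing.</para> |
| <screen><![CDATA[ |
```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical event log: entries are appended but never read or removed. */
struct event {
    struct event *next;
    char text[64];
};

static struct event *event_log = NULL;   /* keeps every entry reachable */
static size_t event_count = 0;

/* Record a message. Returns 0 on success, -1 if malloc fails. */
int log_event(const char *msg)
{
    struct event *e = malloc(sizeof *e);
    if (e == NULL)
        return -1;
    strncpy(e->text, msg, sizeof e->text - 1);
    e->text[sizeof e->text - 1] = '\0';
    e->next = event_log;
    event_log = e;
    event_count++;
    return 0;
}

/* Bytes requested so far that are still reachable through event_log. */
size_t log_bytes_held(void)
{
    return event_count * sizeof(struct event);
}
```
| ]]></screen> |
| <para>Under Massif, the allocation site inside |
| <function>log_event</function> would be expected to account for a steadily |
| growing share of successive detailed snapshots.</para> |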
| |
| <para>Importantly, Massif tells you not only how much heap memory your |
| program is using, but also, in considerable detail, which parts of your |
| program are responsible for allocating that memory. |
| </para> |
| |
| </sect1> |
| |
| |
| <sect1 id="ms-manual.using" xreflabel="Using Massif and ms_print"> |
| <title>Using Massif and ms_print</title> |
| |
| <para>First off, as for the other Valgrind tools, you should compile with |
| debugging info (the <option>-g</option> option). It shouldn't |
| matter much what optimisation level you compile your program with, as this |
| is unlikely to affect the heap memory usage.</para> |
| |
| <para>Then, you need to run Massif itself to gather the profiling |
| information, and then run ms_print to present it in a readable way.</para> |
| |
| |
| |
| |
| <sect2 id="ms-manual.anexample" xreflabel="An Example"> |
| <title>An Example Program</title> |
| |
| <para>An example will make things clear. Consider the following C program |
| (annotated with line numbers) which allocates a number of different blocks |
| on the heap.</para> |
| |
| <screen><![CDATA[ |
|  1      #include <stdlib.h> |
|  2 |
|  3      void g(void) |
|  4      { |
|  5         malloc(4000); |
|  6      } |
|  7 |
|  8      void f(void) |
|  9      { |
| 10         malloc(2000); |
| 11         g(); |
| 12      } |
| 13 |
| 14      int main(void) |
| 15      { |
| 16         int i; |
| 17         int* a[10]; |
| 18 |
| 19         for (i = 0; i < 10; i++) { |
| 20            a[i] = malloc(1000); |
| 21         } |
| 22 |
| 23         f(); |
| 24 |
| 25         g(); |
| 26 |
| 27         for (i = 0; i < 10; i++) { |
| 28            free(a[i]); |
| 29         } |
| 30 |
| 31         return 0; |
| 32      } |
| ]]></screen> |
| |
| </sect2> |
| |
| |
| <sect2 id="ms-manual.running-massif" xreflabel="Running Massif"> |
| <title>Running Massif</title> |
| |
| <para>To gather heap profiling information about the program |
| <computeroutput>prog</computeroutput>, type:</para> |
| <screen><![CDATA[ |
| valgrind --tool=massif prog |
| ]]></screen> |
| |
| <para>The program will execute (slowly). Upon completion, no summary |
| statistics are printed to Valgrind's commentary; all of Massif's profiling |
| data is written to a file. By default, this file is called |
| <filename>massif.out.&lt;pid&gt;</filename>, where |
| <filename>&lt;pid&gt;</filename> is the process ID, although this filename |
| can be changed with the <option>--massif-out-file</option> option.</para> |
| |
| </sect2> |
| |
| |
| <sect2 id="ms-manual.running-ms_print" xreflabel="Running ms_print"> |
| <title>Running ms_print</title> |
| |
| <para>To see the information gathered by Massif in an easy-to-read form, use |
| ms_print. If the output file's name is |
| <filename>massif.out.12345</filename>, type:</para> |
| <screen><![CDATA[ |
| ms_print massif.out.12345]]></screen> |
| |
| <para>ms_print will produce (a) a graph showing the memory consumption over |
| the program's execution, and (b) detailed information about the responsible |
| allocation sites at various points in the program, including the point of |
| peak memory allocation. The use of a separate script for presenting the |
| results is deliberate: it separates the data gathering from its |
| presentation, and means that new methods of presenting the data can be added in |
| the future.</para> |
| |
| </sect2> |
| |
| |
| <sect2 id="ms-manual.theoutputpreamble" xreflabel="The Output Preamble"> |
| <title>The Output Preamble</title> |
| |
| <para>After running this program under Massif, the first part of ms_print's |
| output contains a preamble which just states how the program, Massif and |
| ms_print were each invoked:</para> |
| |
| <screen><![CDATA[ |
| -------------------------------------------------------------------------------- |
| Command: example |
| Massif arguments: (none) |
| ms_print arguments: massif.out.12797 |
| -------------------------------------------------------------------------------- |
| ]]></screen> |
| |
| </sect2> |
| |
| |
| <sect2 id="ms-manual.theoutputgraph" xreflabel="The Output Graph"> |
| <title>The Output Graph</title> |
| |
| <para>The next part is the graph that shows how memory consumption occurred |
| as the program executed:</para> |
| |
| <screen><![CDATA[ |
| KB |
| 19.63^ # |
| | # |
| | # |
| | # |
| | # |
| | # |
| | # |
| | # |
| | # |
| | # |
| | # |
| | # |
| | # |
| | # |
| | # |
| | # |
| | # |
| | :# |
| | :# |
| | :# |
| 0 +----------------------------------------------------------------------->ki |
| 0 113.4 |
| |
| |
| Number of snapshots: 25 |
| Detailed snapshots: [9, 14 (peak), 24] |
| ]]></screen> |
| |
| <para>Why is most of the graph empty, with only a couple of bars at the very |
| end? By default, Massif uses "instructions executed" as the unit of time. |
| For very short-run programs such as the example, most of the executed |
| instructions involve the loading and dynamic linking of the program. The |
| execution of <computeroutput>main</computeroutput> (and thus the heap |
| allocations) occurs only at the very end. For a short-running program like |
| this, we can use the <option>--time-unit=B</option> option |
| to specify that we want the time unit to instead be the number of bytes |
| allocated/deallocated on the heap and stack(s).</para> |
| |
| <para>If we re-run the program under Massif with this option, and then |
| re-run ms_print, we get this more useful graph:</para> |
| |
| <screen><![CDATA[ |
| 19.63^ ### |
| | # |
| | # :: |
| | # : ::: |
| | :::::::::# : : :: |
| | : # : : : :: |
| | : # : : : : ::: |
| | : # : : : : : :: |
| | ::::::::::: # : : : : : : ::: |
| | : : # : : : : : : : :: |
| | ::::: : # : : : : : : : : :: |
| | @@@: : : # : : : : : : : : : @ |
| | ::@ : : : # : : : : : : : : : @ |
| | :::: @ : : : # : : : : : : : : : @ |
| | ::: : @ : : : # : : : : : : : : : @ |
| | ::: : : @ : : : # : : : : : : : : : @ |
| | :::: : : : @ : : : # : : : : : : : : : @ |
| | ::: : : : : @ : : : # : : : : : : : : : @ |
| | :::: : : : : : @ : : : # : : : : : : : : : @ |
| | ::: : : : : : : @ : : : # : : : : : : : : : @ |
| 0 +----------------------------------------------------------------------->KB |
| 0 29.48 |
| |
| Number of snapshots: 25 |
| Detailed snapshots: [9, 14 (peak), 24] |
| ]]></screen> |
| |
| <para>The size of the graph can be changed with ms_print's |
| <option>--x</option> and <option>--y</option> options. Each vertical bar |
| represents a snapshot, i.e. a measurement of the memory usage at a certain |
| point in time. If the next snapshot is more than one column away, a |
| horizontal line of characters is drawn from the top of the snapshot to just |
| before the next snapshot column. The text at the bottom shows that 25 |
| snapshots were taken for this program, which is one per heap |
| allocation/deallocation, plus a couple of extras. Massif starts by taking |
| snapshots for every heap allocation/deallocation, but as a program runs for |
| longer, it takes snapshots less frequently. It also discards older |
| snapshots as the program goes on; when it reaches the maximum number of |
| snapshots (100 by default, although changeable with the |
| <option>--max-snapshots</option> option) half of them are |
| deleted. This means that a reasonable number of snapshots are always |
| maintained.</para> |
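| |
| <para>The halving behaviour can be modelled in a few lines. The sketch |
| below is a simplified model, not Massif's implementation: the names are |
| hypothetical, and which half is discarded (here, every second snapshot) is |
| an assumption made purely for illustration. It also ignores the gradual |
| reduction in snapshot frequency.</para> |
| <screen><![CDATA[ |
```c
#include <stddef.h>

#define MAX_SNAPSHOTS 100   /* mirrors the documented default */

struct snapshot_store {
    long times[MAX_SNAPSHOTS];   /* time stamp of each retained snapshot */
    size_t count;
};

/* Keep every second snapshot (ASSUMPTION: the manual only says "half"). */
static void cull_half(struct snapshot_store *s)
{
    size_t kept = 0;
    for (size_t i = 0; i < s->count; i += 2)
        s->times[kept++] = s->times[i];
    s->count = kept;
}

/* Add a snapshot, discarding half of the old ones when the store is full. */
void add_snapshot(struct snapshot_store *s, long time)
{
    if (s->count == MAX_SNAPSHOTS)
        cull_half(s);
    s->times[s->count++] = time;
}
```
| ]]></screen> |
| <para>In this model, once the limit has been reached for the first time, |
| the number of retained snapshots stays above half the maximum and never |
| exceeds the maximum.</para> |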
| |
| <para>Most snapshots are <emphasis>normal</emphasis>, and only basic |
| information is recorded for them. Normal snapshots are represented in the |
| graph by bars consisting of ':' characters.</para> |
| |
| <para>Some snapshots are <emphasis>detailed</emphasis>. Information about |
| where allocations happened is recorded for these snapshots, as we will see |
| shortly. Detailed snapshots are represented in the graph by bars consisting |
| of '@' characters. The text at the bottom shows that 3 detailed |
| snapshots were taken for this program (snapshots 9, 14 and 24). By default, |
| every 10th snapshot is detailed, although this can be changed via the |
| <option>--detailed-freq</option> option.</para> |
| |
| <para>Finally, there is at most one <emphasis>peak</emphasis> snapshot. The |
| peak snapshot is a detailed snapshot, and records the point where memory |
| consumption was greatest. The peak snapshot is represented in the graph by |
| a bar consisting of '#' characters. The text at the bottom shows |
| that snapshot 14 was the peak.</para> |
| |
| <para>Massif's determination of when the peak occurred can be wrong, for |
| two reasons.</para> |
| |
| <itemizedlist> |
| <listitem><para>Peak snapshots are only ever taken after a deallocation |
| happens. This avoids lots of unnecessary peak snapshot recordings |
| (imagine what happens if your program allocates a lot of heap blocks in |
| succession, hitting a new peak every time). But it means that if your |
| program never deallocates any blocks, no peak will be recorded. It also |
| means that if your program does deallocate blocks but later allocates to a |
| higher peak without subsequently deallocating, the reported peak will be |
| too low. |
| </para> |
| </listitem> |
| |
| <listitem><para>Even with this behaviour, recording the peak accurately |
| is slow. So by default Massif records a peak whose size is within 1% of |
| the size of the true peak. This inaccuracy in the peak measurement can be |
| changed with the <option>--peak-inaccuracy</option> option.</para> |
| </listitem> |
| </itemizedlist> |
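| |
| <para>Both limitations follow from the way the peak is tracked. The sketch |
| below is a simplified model of the rules just described, not Massif's own |
| code. The names are hypothetical, and it assumes that the check made at a |
| deallocation uses the heap size just before the block is released.</para> |
| <screen><![CDATA[ |
```c
#include <stddef.h>

/* Simplified model of peak tracking (all names are hypothetical). */
struct peak_tracker {
    size_t current;        /* bytes currently allocated */
    size_t recorded_peak;  /* size of the last peak snapshot, 0 if none */
    double inaccuracy;     /* percentage, e.g. 1.0 */
};

void on_alloc(struct peak_tracker *t, size_t bytes)
{
    t->current += bytes;   /* no peak snapshot is considered on allocation */
}

/* ASSUMPTION: the peak check sees the size just before the block is freed. */
void on_free(struct peak_tracker *t, size_t bytes)
{
    double threshold = (double)t->recorded_peak * (1.0 + t->inaccuracy / 100.0);
    if ((double)t->current >= threshold)
        t->recorded_peak = t->current;
    t->current -= bytes;
}
```
| ]]></screen> |
| <para>In this model, a program that never calls |
| <function>on_free</function> leaves <varname>recorded_peak</varname> at |
| zero. A program that allocates 1,000 bytes, frees them, and then allocates |
| 5,000 bytes without freeing again ends with a recorded peak of 1,000, which |
| is too low.</para> |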
| |
| <para>The following graph is from an execution of Konqueror, the KDE web |
| browser. It shows what graphs for larger programs look like.</para> |
| <screen><![CDATA[ |
| MB |
| 3.952^ # |
| | @#: |
| | :@@#: |
| | @@::::@@#: |
| | @ :: :@@#:: |
| | @@@ :: :@@#:: |
| | @@:@@@ :: :@@#:: |
| | :::@ :@@@ :: :@@#:: |
| | : :@ :@@@ :: :@@#:: |
| | :@: :@ :@@@ :: :@@#:: |
| | @@:@: :@ :@@@ :: :@@#::: |
| | : :: ::@@:@: :@ :@@@ :: :@@#::: |
| | :@@: ::::: ::::@@@:::@@:@: :@ :@@@ :: :@@#::: |
| | ::::@@: ::: ::::::: @ :::@@:@: :@ :@@@ :: :@@#::: |
| | @: ::@@: ::: ::::::: @ :::@@:@: :@ :@@@ :: :@@#::: |
| | @: ::@@: ::: ::::::: @ :::@@:@: :@ :@@@ :: :@@#::: |
| | @: ::@@:::::: ::::::: @ :::@@:@: :@ :@@@ :: :@@#::: |
| | ::@@@: ::@@:: ::: ::::::: @ :::@@:@: :@ :@@@ :: :@@#::: |
| | :::::@ @: ::@@:: ::: ::::::: @ :::@@:@: :@ :@@@ :: :@@#::: |
| | @@:::::@ @: ::@@:: ::: ::::::: @ :::@@:@: :@ :@@@ :: :@@#::: |
| 0 +----------------------------------------------------------------------->Mi |
| 0 626.4 |
| |
| Number of snapshots: 63 |
| Detailed snapshots: [3, 4, 10, 11, 15, 16, 29, 33, 34, 36, 39, 41, |
| 42, 43, 44, 49, 50, 51, 53, 55, 56, 57 (peak)] |
| ]]></screen> |
| |
| <para>Note that the larger size units are KB, MB, GB, etc. As is typical |
| for memory measurements, these are based on a multiplier of 1024, rather |
| than the standard SI multiplier of 1000. Strictly speaking, they should be |
| written KiB, MiB, GiB, etc.</para> |
| |
| </sect2> |
| |
| |
| <sect2 id="ms-manual.thesnapshotdetails" xreflabel="The Snapshot Details"> |
| <title>The Snapshot Details</title> |
| |
| <para>Returning to our example, the graph is followed by the detailed |
| information for each snapshot. The first nine snapshots are normal, so only |
| a small amount of information is recorded for each one:</para> |
| <screen><![CDATA[ |
| -------------------------------------------------------------------------------- |
| n time(B) total(B) useful-heap(B) extra-heap(B) stacks(B) |
| -------------------------------------------------------------------------------- |
| 0 0 0 0 0 0 |
| 1 1,008 1,008 1,000 8 0 |
| 2 2,016 2,016 2,000 16 0 |
| 3 3,024 3,024 3,000 24 0 |
| 4 4,032 4,032 4,000 32 0 |
| 5 5,040 5,040 5,000 40 0 |
| 6 6,048 6,048 6,000 48 0 |
| 7 7,056 7,056 7,000 56 0 |
| 8 8,064 8,064 8,000 64 0 |
| ]]></screen> |
| |
| <para>Each normal snapshot records several things.</para> |
| |
| <itemizedlist> |
| <listitem><para>Its number.</para></listitem> |
| |
| <listitem><para>The time it was taken. In this case, the time unit is |
| bytes, due to the use of |
| <option>--time-unit=B</option>.</para></listitem> |
| |
| <listitem><para>The total memory consumption at that point.</para></listitem> |
| |
| <listitem><para>The number of useful heap bytes allocated at that point. |
| This reflects the number of bytes asked for by the |
| program.</para></listitem> |
| |
| <listitem><para>The number of extra heap bytes allocated at that point. |
| This reflects the number of bytes allocated in excess of what the program |
| asked for. There are two sources of extra heap bytes.</para> |
| |
| <para>First, every heap block has administrative bytes associated with it. |
| The exact number of administrative bytes depends on the details of the |
| allocator. By default Massif assumes 8 bytes per block, as can be seen |
| from the example, but this number can be changed via the |
| <option>--heap-admin</option> option.</para> |
| |
| <para>Second, allocators often round up the number of bytes asked for to a |
| larger number, usually 8 or 16. This is required to ensure that elements |
| within the block are suitably aligned. If N bytes are asked for, Massif |
| rounds N up to the nearest multiple of the value specified by the |
| <option><xref linkend="opt.alignment"/></option> option. |
| </para></listitem> |
| |
| <listitem><para>The size of the stack(s). By default, stack profiling is |
| off as it slows Massif down greatly. Therefore, the stack column is zero |
| in the example. Stack profiling can be turned on with the |
| <option>--stacks=yes</option> option. |
| |
| </para></listitem> |
| </itemizedlist> |
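| |
| <para>The useful and extra byte counts for a single block can be worked out |
| by hand. The following sketch shows the arithmetic as described above; the |
| function names are hypothetical, and the alignment is passed in as a |
| parameter because it depends on the option setting.</para> |
| <screen><![CDATA[ |
```c
#include <stddef.h>

/* Round n up to the next multiple of alignment (alignment must be > 0). */
static size_t round_up(size_t n, size_t alignment)
{
    return (n + alignment - 1) / alignment * alignment;
}

/* Extra heap bytes charged for one block: admin bytes plus alignment padding. */
size_t extra_heap_bytes(size_t requested, size_t admin, size_t alignment)
{
    return admin + (round_up(requested, alignment) - requested);
}

/* Total bytes charged for one block: useful bytes plus extra bytes. */
size_t total_block_bytes(size_t requested, size_t admin, size_t alignment)
{
    return requested + extra_heap_bytes(requested, admin, alignment);
}
```
| ]]></screen> |
| <para>With 8 administrative bytes and an assumed alignment of 8, a |
| 1,000-byte request costs 1,008 bytes, which matches snapshot 1 in the table |
| above. A 1,001-byte request would cost 1,016 bytes: 7 bytes of padding plus |
| 8 administrative bytes.</para> |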
| |
| <para>The next snapshot is detailed. As well as the basic counts, it gives |
| an allocation tree which indicates exactly which pieces of code were |
| responsible for allocating heap memory:</para> |
| |
| <screen><![CDATA[ |
| 9 9,072 9,072 9,000 72 0 |
| 99.21% (9,000B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc. |
| ->99.21% (9,000B) 0x804841A: main (example.c:20) |
| ]]></screen> |
| |
| <para>The allocation tree can be read from the top down. The first line |
| indicates all heap allocation functions such as <function>malloc</function> |
| and C++ <function>new</function>. All heap allocations go through these |
| functions, and so all 9,000 useful bytes (which is 99.21% of all allocated |
| bytes) go through them. But how were <function>malloc</function> and new |
| called? At this point, every allocation so far has been due to line 20 |
| inside <function>main</function>, hence the second line in the tree. The |
| <option>-&gt;</option> indicates that <function>main</function> (line 20) called |
| <function>malloc</function>.</para> |
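| |
| <para>The percentage on each line appears to be the entry's useful bytes |
| divided by the snapshot's total, extra bytes included. The helper below is a |
| sketch of that calculation with a hypothetical name. It reproduces the |
| figures shown, but it is not taken from ms_print.</para> |
| <screen><![CDATA[ |
```c
/* Share of the snapshot's total bytes attributed to one tree entry, in percent. */
double percent_of_total(unsigned long entry_bytes, unsigned long total_bytes)
{
    if (total_bytes == 0)
        return 0.0;
    return 100.0 * (double)entry_bytes / (double)total_bytes;
}
```
| ]]></screen> |
| <para>For snapshot 9, 9,000 useful bytes out of a total of 9,072 gives |
| roughly 99.21%.</para> |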
| |
| <para>Let's see what the subsequent output shows happened next:</para> |
| |
| <screen><![CDATA[ |
| -------------------------------------------------------------------------------- |
| n time(B) total(B) useful-heap(B) extra-heap(B) stacks(B) |
| -------------------------------------------------------------------------------- |
| 10 10,080 10,080 10,000 80 0 |
| 11 12,088 12,088 12,000 88 0 |
| 12 16,096 16,096 16,000 96 0 |
| 13 20,104 20,104 20,000 104 0 |
| 14 20,104 20,104 20,000 104 0 |
| 99.48% (20,000B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc. |
| ->49.74% (10,000B) 0x804841A: main (example.c:20) |
| | |
| ->39.79% (8,000B) 0x80483C2: g (example.c:5) |
| | ->19.90% (4,000B) 0x80483E2: f (example.c:11) |
| | | ->19.90% (4,000B) 0x8048431: main (example.c:23) |
| | | |
| | ->19.90% (4,000B) 0x8048436: main (example.c:25) |
| | |
| ->09.95% (2,000B) 0x80483DA: f (example.c:10) |
| ->09.95% (2,000B) 0x8048431: main (example.c:23) |
| ]]></screen> |
| |
| <para>The first four snapshots are similar to the previous ones. But then |
| the global allocation peak is reached, and a detailed snapshot (number 14) |
| is taken. Its allocation tree shows that 20,000B of useful heap memory has |
| been allocated, and the lines and arrows indicate that this is from three |
| different code locations: line 20, which is responsible for 10,000B |
| (49.74%); line 5, which is responsible for 8,000B (39.79%); and line 10, |
| which is responsible for 2,000B (9.95%).</para> |
| |
| <para>We can then drill down further in the allocation tree. For example, |
| of the 8,000B asked for by line 5, half of it was due to a call from line |
| 11, and half was due to a call from line 25.</para> |
| |
| <para>In short, Massif collates the stack trace of every single allocation |
| point in the program into a single tree, which gives a complete picture at |
| a particular point in time of how and why all heap memory was |
| allocated.</para> |
| |
| <para>Note that the tree entries correspond not to functions, but to |
| individual code locations. For example, if function <function>A</function> |
| calls <function>malloc</function>, and function <function>B</function> calls |
| <function>A</function> twice, once on line 10 and once on line 11, then |
| the two calls will result in two distinct stack traces in the tree. In |
| contrast, if <function>B</function> calls <function>A</function> repeatedly |
| from line 15 (e.g. due to a loop), then each of those calls will be |
| represented by the same stack trace in the tree.</para> |
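| |
| <para>The difference between functions and code locations is easy to see in |
| a small program. This is a hypothetical sketch: the bodies of |
| <function>A</function> and <function>B</function> are invented for |
| illustration, and the source line numbers will differ from those in the |
| text.</para> |
| <screen><![CDATA[ |
```c
#include <stdlib.h>

/* Hypothetical allocator wrapper: the only function that calls malloc. */
void *A(void)
{
    return malloc(100);
}

/* Calls A from three source lines; fills out[] and returns the block count. */
size_t B(void *out[], size_t capacity)
{
    size_t n = 0;
    if (capacity < 5)
        return 0;
    out[n++] = A();              /* call site 1: its own stack trace */
    out[n++] = A();              /* call site 2: a second, distinct trace */
    for (int i = 0; i < 3; i++)
        out[n++] = A();          /* call site 3: one trace shared by 3 calls */
    return n;
}
```
| ]]></screen> |
| <para>One call to <function>B</function> allocates five blocks, but they |
| would be grouped under only three code locations within |
| <function>B</function>.</para> |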
| |
| <para>Note also that each tree entry with children in the example satisfies an |
| invariant: the entry's size is equal to the sum of its children's sizes. |
| For example, the first entry has size 20,000B, and its children have sizes |
| 10,000B, 8,000B, and 2,000B. In general, this invariant almost always |
| holds. However, in rare circumstances stack traces can be malformed, in |
| which case a stack trace can be a sub-trace of another stack trace. This |
| means that some entries in the tree may not satisfy the invariant -- the |
| entry's size will be greater than the sum of its children's sizes. This is |
| not a big problem, but could make the results confusing. Massif can |
| sometimes detect when this happens; if it does, it issues a warning:</para> |
| |
| <screen><![CDATA[ |
| Warning: Malformed stack trace detected. In Massif's output, |
| the size of an entry's child entries may not sum up |
| to the entry's size as they normally do. |
| ]]></screen> |
| |
| <para>However, Massif does not detect and warn about every such occurrence. |
| Fortunately, malformed stack traces are rare in practice.</para> |
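| |
| <para>The invariant can be checked mechanically. The sketch below uses a |
| hypothetical in-memory representation of an allocation tree. It does not |
| reflect Massif's data structures or output file format.</para> |
| <screen><![CDATA[ |
```c
#include <stddef.h>

#define MAX_CHILDREN 4

/* Hypothetical allocation-tree entry: a size plus links to child entries. */
struct tree_entry {
    unsigned long bytes;
    size_t n_children;
    const struct tree_entry *children[MAX_CHILDREN];
};

/* Returns 1 if every entry with children equals the sum of its children. */
int invariant_holds(const struct tree_entry *e)
{
    unsigned long sum = 0;
    if (e->n_children == 0)
        return 1;
    for (size_t i = 0; i < e->n_children; i++) {
        if (!invariant_holds(e->children[i]))
            return 0;
        sum += e->children[i]->bytes;
    }
    return sum == e->bytes;
}
```
| ]]></screen> |
| <para>Applied to snapshot 14, the check passes: 20,000B splits into 10,000B, |
| 8,000B and 2,000B, and the 8,000B entry splits into two 4,000B |
| children.</para> |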
| |
| <para>Returning now to ms_print's output, the final part is similar:</para> |
| |
| <screen><![CDATA[ |
| -------------------------------------------------------------------------------- |
| n time(B) total(B) useful-heap(B) extra-heap(B) stacks(B) |
| -------------------------------------------------------------------------------- |
| 15 21,112 19,096 19,000 96 0 |
| 16 22,120 18,088 18,000 88 0 |
| 17 23,128 17,080 17,000 80 0 |
| 18 24,136 16,072 16,000 72 0 |
| 19 25,144 15,064 15,000 64 0 |
| 20 26,152 14,056 14,000 56 0 |
| 21 27,160 13,048 13,000 48 0 |
| 22 28,168 12,040 12,000 40 0 |
| 23 29,176 11,032 11,000 32 0 |
| 24 30,184 10,024 10,000 24 0 |
| 99.76% (10,000B) (heap allocation functions) malloc/new/new[], --alloc-fns, etc. |
| ->79.81% (8,000B) 0x80483C2: g (example.c:5) |
| | ->39.90% (4,000B) 0x80483E2: f (example.c:11) |
| | | ->39.90% (4,000B) 0x8048431: main (example.c:23) |
| | | |
| | ->39.90% (4,000B) 0x8048436: main (example.c:25) |
| | |
| ->19.95% (2,000B) 0x80483DA: f (example.c:10) |
| | ->19.95% (2,000B) 0x8048431: main (example.c:23) |
| | |
| ->00.00% (0B) in 1+ places, all below ms_print's threshold (01.00%) |
| ]]></screen> |
| |
| <para>The final detailed snapshot shows how the heap looked at termination. |
| The 00.00% entry represents the code locations for which memory was |
| allocated and then freed (line 20 in this case, the memory for which was |
| freed on line 28). However, no code location details are given for this |
| entry; by default, Massif only records the details for code locations |
| responsible for more than 1% of useful memory bytes, and ms_print likewise |
| only prints the details for code locations responsible for more than 1%. |
| The entries that do not meet this threshold are aggregated. This avoids |
| filling up the output with large numbers of unimportant entries. The |
| thresholds can be changed with the |
| <option>--threshold</option> option that both Massif and |
| ms_print support.</para> |
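| |
| <para>Taken together, the three tables also show how time advances under |
| <option>--time-unit=B</option>: the clock moves forward on deallocations as |
| well as on allocations. The sketch below replays the example program with |
| hypothetical counters, assuming each block costs its requested size plus 8 |
| administrative bytes, as in the tables.</para> |
| <screen><![CDATA[ |
```c
/* Hypothetical counters mirroring the time(B) and total(B) columns. */
struct byte_clock {
    unsigned long time_b;    /* bytes allocated plus bytes deallocated so far */
    unsigned long total_b;   /* bytes currently allocated */
};

void clock_alloc(struct byte_clock *c, unsigned long bytes)
{
    c->time_b += bytes;
    c->total_b += bytes;
}

void clock_free(struct byte_clock *c, unsigned long bytes)
{
    c->time_b += bytes;      /* time still moves forward on a free */
    c->total_b -= bytes;
}

/* Replay the example program: each block costs its size plus 8 admin bytes. */
struct byte_clock replay_example(void)
{
    struct byte_clock c = { 0, 0 };
    for (int i = 0; i < 10; i++)
        clock_alloc(&c, 1000 + 8);   /* main, line 20 */
    clock_alloc(&c, 2000 + 8);       /* f, line 10 */
    clock_alloc(&c, 4000 + 8);       /* g called from f, line 11 */
    clock_alloc(&c, 4000 + 8);       /* g called from main, line 25 */
    for (int i = 0; i < 10; i++)
        clock_free(&c, 1000 + 8);    /* main, line 28 */
    return c;
}
```
| ]]></screen> |
| <para>The replay ends at a time of 30,184 bytes with 10,024 bytes still |
| allocated, which agrees with snapshot 24.</para> |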
| |
| </sect2> |
| |
| |
| <sect2 id="ms-manual.forkingprograms" xreflabel="Forking Programs"> |
| <title>Forking Programs</title> |
| <para>If your program forks, the child will inherit all the profiling data that |
| has been gathered for the parent.</para> |
| |
| <para>If the output file format string (controlled by |
| <option>--massif-out-file</option>) does not contain <option>%p</option>, then |
| the outputs from the parent and child will be intermingled in a single output |
| file, which will almost certainly make it unreadable by ms_print.</para> |
| </sect2> |
| |
| |
| <sect2 id="ms-manual.not-measured" |
| xreflabel="Memory Allocations Not Measured by Massif"> |
| <title>Memory Allocations Not Measured by Massif</title> |
| <para> |
| It is worth emphasising that Massif measures only heap memory, i.e. memory |
| allocated with |
| <function>malloc</function>, |
| <function>calloc</function>, |
| <function>realloc</function>, |
| <function>memalign</function>, |
| <function>new</function>, |
| <function>new[]</function>, |
| and a few other, similar functions. (And it can optionally measure stack |
| memory, of course.) This means it does <emphasis>not</emphasis> directly |
| measure memory allocated with lower-level system calls such as |
| <function>mmap</function>, |
| <function>mremap</function>, and |
| <function>brk</function>. |
| </para> |
| |
| <para> |
| Heap allocation functions such as <function>malloc</function> are built on |
| top of these system calls. For example, when needed, an allocator will |
| typically call <function>mmap</function> to allocate a large chunk of |
| memory, and then hand over pieces of that memory chunk to the client program |
| in response to calls to <function>malloc</function> et al. Massif directly |
| measures only these higher-level <function>malloc</function> et al. calls, |
| not the lower-level system calls. |
| </para> |
| |
| <para> |
| Furthermore, a client program may use these lower-level system calls |
| directly to allocate memory. Massif does not measure these. Nor does it |
| measure the size of code, data and BSS segments. Therefore, the numbers |
| reported by Massif may be significantly smaller than those reported by tools |
| such as <filename>top</filename> that measure a program's total size in |
| memory. |
| </para> |
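| |
| <para>As an illustration, the following sketch obtains memory directly from |
| the kernel on a POSIX system. The helper names are hypothetical. Because no |
| heap allocation function is involved, memory obtained this way would not be |
| expected to appear in Massif's heap figures, although a tool such as |
| <filename>top</filename> would count it.</para> |
| <screen><![CDATA[ |
```c
#define _DEFAULT_SOURCE   /* exposes MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: obtain memory straight from the kernel, bypassing malloc. */
void *grab_pages(size_t bytes)
{
    void *p = mmap(NULL, bytes, PROT_READ | PROT_WRITE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    return p == MAP_FAILED ? NULL : p;
}

/* Return memory obtained with grab_pages. Returns 0 on success. */
int release_pages(void *p, size_t bytes)
{
    return munmap(p, bytes);
}
```
| ]]></screen> |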
| |
| </sect2> |
| |
| |
| <sect2 id="ms-manual.acting" xreflabel="Action on Massif's Information"> |
| <title>Acting on Massif's Information</title> |
| <para>Massif's information is generally fairly easy to act upon. The |
| obvious place to start looking is the peak snapshot.</para> |
| |
| <para>It can also be useful to look at the overall shape of the graph, to |
| see if memory usage climbs and falls as you expect; spikes in the graph |
| might be worth investigating.</para> |
| |
| <para>The detailed snapshots can get quite large. It is worth viewing them |
| in a very wide window. It's also a good idea to view them with a text |
| editor. That makes it easy to scroll up and down while keeping the cursor |
| in a particular column, which makes following the allocation chains easier. |
| </para> |
| |
| </sect2> |
| |
| </sect1> |
| |
| |
| <sect1 id="ms-manual.options" xreflabel="Massif Command-line Options"> |
| <title>Massif Command-line Options</title> |
| |
| <para>Massif-specific command-line options are:</para> |
| |
| <!-- start of xi:include in the manpage --> |
| <variablelist id="ms.opts.list"> |
| |
| <varlistentry id="opt.heap" xreflabel="--heap"> |
| <term> |
| <option><![CDATA[--heap=<yes|no> [default: yes] ]]></option> |
| </term> |
| <listitem> |
| <para>Specifies whether heap profiling should be done.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry id="opt.heap-admin" xreflabel="--heap-admin"> |
| <term> |
| <option><![CDATA[--heap-admin=<size> [default: 8] ]]></option> |
| </term> |
| <listitem> |
| <para>If heap profiling is enabled, gives the number of administrative |
| bytes per block to use. This should be an estimate of the average, |
| since it may vary. For example, the allocator used by |
| glibc on Linux requires somewhere between 4 and |
| 15 bytes per block, depending on various factors. That allocator also |
| requires admin space for freed blocks, but Massif cannot |
| account for this.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry id="opt.stacks" xreflabel="--stacks"> |
| <term> |
| <option><![CDATA[--stacks=<yes|no> [default: no] ]]></option> |
| </term> |
| <listitem> |
| <para>Specifies whether stack profiling should be done. This option |
| slows Massif down greatly, and so is off by default. Note that Massif |
| assumes that the main stack has size zero at start-up. This is not |
| true, but accounting for the real start-up size accurately is difficult. Furthermore, |
| starting at zero better indicates the size of the part of the main |
| stack that a user program actually has control over.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry id="opt.depth" xreflabel="--depth"> |
| <term> |
| <option><![CDATA[--depth=<number> [default: 30] ]]></option> |
| </term> |
| <listitem> |
| <para>Maximum depth of the allocation trees recorded for detailed |
| snapshots. Increasing it will make Massif run somewhat more slowly, |
| use more memory, and produce bigger output files.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry id="opt.alloc-fn" xreflabel="--alloc-fn"> |
| <term> |
| <option><![CDATA[--alloc-fn=<name> ]]></option> |
| </term> |
| <listitem> |
| <para>Functions specified with this option will be treated as though |
| they were a heap allocation function such as |
| <function>malloc</function>. This is useful for functions that are |
| wrappers to <function>malloc</function> or <function>new</function>, |
| which can fill up the allocation trees with uninteresting information. |
| This option can be specified multiple times on the command line, to |
| name multiple functions.</para> |
| |
| <para>Note that the named function will only be treated this way if it is |
| the top entry in a stack trace, or just below another function treated |
| this way. For example, if you have a function |
| <function>malloc1</function> that wraps <function>malloc</function>, |
| and <function>malloc2</function> that wraps |
| <function>malloc1</function>, just specifying |
| <option>--alloc-fn=malloc2</option> will have no effect. You need to |
| specify <option>--alloc-fn=malloc1</option> as well. This is a little |
| inconvenient, but the reason is that checking for allocation functions |
| is slow, and it saves a lot of time if Massif can stop looking through |
| the stack trace entries as soon as it finds one that doesn't match |
| rather than having to continue through all the entries.</para> |
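| |
| <para>The wrapper arrangement described above might look like the following |
| sketch. The bodies of <function>malloc1</function> and |
| <function>malloc2</function> are assumptions, and |
| <function>make_buffer</function> is a hypothetical caller added for |
| illustration.</para> |
| <screen><![CDATA[ |
```c
#include <stdlib.h>

/* Innermost wrapper: calls malloc directly. */
void *malloc1(size_t n)
{
    return malloc(n);
}

/* Outer wrapper: calls malloc1, so it sits just below malloc1 in a trace. */
void *malloc2(size_t n)
{
    return malloc1(n);
}

/* Hypothetical caller: the code location worth seeing in the allocation tree. */
char *make_buffer(void)
{
    return malloc2(256);
}
```
| ]]></screen> |
| <para>A stack trace for this allocation runs from |
| <function>malloc</function> through <function>malloc1</function> and |
| <function>malloc2</function> to <function>make_buffer</function>. Naming |
| both wrappers with <option>--alloc-fn</option> should leave |
| <function>make_buffer</function> as the first entry shown below the |
| allocation functions.</para> |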
| |
| <para>Note that C++ names are demangled. Note also that overloaded |
| C++ names must be written in full. Single quotes may be necessary to |
| prevent the shell from breaking them up. For example: |
| <screen><![CDATA[ |
| --alloc-fn='operator new(unsigned, std::nothrow_t const&)' |
| ]]></screen> |
| </para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry id="opt.ignore-fn" xreflabel="--ignore-fn"> |
| <term> |
| <option><![CDATA[--ignore-fn=<name> ]]></option> |
| </term> |
| <listitem> |
| <para>Any direct heap allocation (i.e. a call to |
| <function>malloc</function>, <function>new</function>, etc., or a call |
| to a function named by an <option>--alloc-fn</option> |
| option) that occurs in a function specified by this option will be |
| ignored. This is mostly useful for testing purposes. This option can |
| be specified multiple times on the command line, to name multiple |
| functions. |
| </para> |
| |
| <para>Any <function>realloc</function> of an ignored block will |
| also be ignored, even if the <function>realloc</function> call does |
| not occur in an ignored function. This avoids the possibility of |
| negative heap sizes if ignored blocks are shrunk with |
| <function>realloc</function>. |
| </para> |
| |
| <para>The rules for writing C++ function names are the same as |
| for <option>--alloc-fn</option> above. |
| </para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry id="opt.threshold" xreflabel="--threshold"> |
| <term> |
| <option><![CDATA[--threshold=<m.n> [default: 1.0] ]]></option> |
| </term> |
| <listitem> |
| <para>The significance threshold for heap allocations, as a |
| percentage of total memory size. Allocation tree entries that account |
| for less than this will be aggregated. Note that this should be |
| specified in tandem with ms_print's option of the same name.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry id="opt.peak-inaccuracy" xreflabel="--peak-inaccuracy"> |
| <term> |
| <option><![CDATA[--peak-inaccuracy=<m.n> [default: 1.0] ]]></option> |
| </term> |
| <listitem> |
| <para>Massif does not necessarily record the actual global memory |
| allocation peak; by default it records a peak only when the global |
| memory allocation size exceeds the previous peak by at least 1.0%. |
| This is because there can be many local allocation peaks along the way, |
| and doing a detailed snapshot for every one would be expensive and |
| wasteful, as all but one of them will be later discarded. This |
| inaccuracy can be changed (even to 0.0%) via this option, but Massif |
| will run drastically slower as the number approaches zero.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry id="opt.time-unit" xreflabel="--time-unit"> |
| <term> |
| <option><![CDATA[--time-unit=<i|ms|B> [default: i] ]]></option> |
| </term> |
| <listitem> |
| <para>The time unit used for the profiling. There are three |
| possibilities: instructions executed (i), which is good for most |
| cases; real (wallclock) time (ms, i.e. milliseconds), which is |
| sometimes useful; and bytes allocated/deallocated on the heap and/or |
| stack (B), which is useful for very short-run programs, and for |
| testing purposes, because it is the most reproducible across different |
| machines.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry id="opt.detailed-freq" xreflabel="--detailed-freq"> |
| <term> |
| <option><![CDATA[--detailed-freq=<n> [default: 10] ]]></option> |
| </term> |
| <listitem> |
| <para>Frequency of detailed snapshots: with |
| <option>--detailed-freq=N</option>, every Nth snapshot is detailed, so |
| <option>--detailed-freq=1</option> makes every snapshot |
| detailed.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry id="opt.max-snapshots" xreflabel="--max-snapshots"> |
| <term> |
| <option><![CDATA[--max-snapshots=<n> [default: 100] ]]></option> |
| </term> |
| <listitem> |
| <para>The maximum number of snapshots recorded. If set to N, for all |
| programs except very short-running ones, the final number of snapshots |
| will be between N/2 and N.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry id="opt.massif-out-file" xreflabel="--massif-out-file"> |
| <term> |
| <option><![CDATA[--massif-out-file=<file> [default: massif.out.%p] ]]></option> |
| </term> |
| <listitem> |
| <para>Write the profile data to <computeroutput>file</computeroutput> |
| rather than to the default output file, |
| <computeroutput>massif.out.&lt;pid&gt;</computeroutput>. The |
| <option>%p</option> and <option>%q</option> format specifiers can be |
| used to embed the process ID and/or the contents of an environment |
| variable in the name, as is the case for the core option |
| <option><xref linkend="opt.log-file"/></option>. |
| </para> |
| </listitem> |
| </varlistentry> |
| |
| </variablelist> |
| <!-- end of xi:include in the manpage --> |
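<para>The rule described for <option>--peak-inaccuracy</option> can be
pictured with the small C sketch below.  It is only an illustration of the
stated rule, not Massif's actual code: the function
<function>is_new_peak</function> and its parameter names are hypothetical,
and <varname>inaccuracy_pct</varname> merely stands in for the option's
value.</para>

<programlisting><![CDATA[
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: report whether `current` (the global allocation
   size in bytes) should be treated as a new peak, given the previously
   recorded peak and an inaccuracy expressed as a percentage. */
bool is_new_peak(size_t current, size_t prev_peak, double inaccuracy_pct)
{
    /* Smallest size that exceeds the old peak by the required margin. */
    double needed = (double)prev_peak * (1.0 + inaccuracy_pct / 100.0);

    /* The size must be a genuine increase and must clear the margin. */
    return current > prev_peak && (double)current >= needed;
}
]]></programlisting>

<para>Under this sketch, with an inaccuracy of 1.0 a rise from 1000 to 1005
bytes is not treated as a new peak, whereas a rise to 1020 bytes is.  With
0.0, any increase counts.</para>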
| |
| </sect1> |
| |
| |
| <sect1 id="ms-manual.clientreqs" xreflabel="Client requests"> |
| <title>Massif Client Requests</title> |
| |
| <para>Massif does not have a <filename>massif.h</filename> file, but it does |
| implement two of the core client requests: |
| <function>VALGRIND_MALLOCLIKE_BLOCK</function> and |
| <function>VALGRIND_FREELIKE_BLOCK</function>; they are described in |
| <xref linkend="manual-core-adv.clientreq"/>. |
| </para> |
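<para>As a rough illustration, a custom allocator might use these two
requests as sketched below.  The static arena and the functions
<function>pool_alloc</function> and <function>pool_free</function> are
hypothetical names invented for this example.  The no-op fallback macros are
also an assumption of the sketch: they exist only so that it compiles where
<filename>valgrind/valgrind.h</filename> is not installed.</para>

<programlisting><![CDATA[
#include <stddef.h>

/* Use the real client requests when the Valgrind header is available;
   otherwise fall back to no-ops so that the sketch still builds. */
#if defined(__has_include)
#  if __has_include(<valgrind/valgrind.h>)
#    include <valgrind/valgrind.h>
#  endif
#endif
#ifndef VALGRIND_MALLOCLIKE_BLOCK
#  define VALGRIND_MALLOCLIKE_BLOCK(addr, sizeB, rzB, is_zeroed) ((void)0)
#  define VALGRIND_FREELIKE_BLOCK(addr, rzB) ((void)0)
#endif

#define POOL_SIZE 4096

static unsigned char pool[POOL_SIZE];   /* hypothetical fixed arena */
static size_t pool_used = 0;            /* bytes handed out so far */

/* Carve `size` bytes out of the arena and report them as a heap block. */
void *pool_alloc(size_t size)
{
    size_t aligned;
    void *p;

    if (size == 0 || size > POOL_SIZE)
        return NULL;
    aligned = (size + 15u) & ~(size_t)15u;  /* offsets stay multiples of 16 */
    if (aligned > POOL_SIZE - pool_used)
        return NULL;

    p = &pool[pool_used];
    pool_used += aligned;

    /* Arguments: address, size in bytes, redzone size, is_zeroed. */
    VALGRIND_MALLOCLIKE_BLOCK(p, size, 0, 0);
    return p;
}

/* Report the block as freed; this simple arena never reuses space. */
void pool_free(void *p)
{
    if (p != NULL)
        VALGRIND_FREELIKE_BLOCK(p, 0);  /* Arguments: address, redzone size. */
}
]]></programlisting>

<para>The redzone size passed to both requests is zero here because the
sketch places no padding around the blocks it hands out.</para>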
| |
| </sect1> |
| |
| |
| <sect1 id="ms-manual.ms_print-options" xreflabel="ms_print Command-line Options"> |
| <title>ms_print Command-line Options</title> |
| |
| <para>ms_print's options are:</para> |
| |
| <!-- start of xi:include in the manpage --> |
| <variablelist id="ms_print.opts.list"> |
| |
| <varlistentry> |
| <term> |
| <option><![CDATA[-h --help ]]></option> |
| </term> |
| <listitem> |
| <para>Show the help message.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term> |
| <option><![CDATA[--version ]]></option> |
| </term> |
| <listitem> |
| <para>Show the version number.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term> |
| <option><![CDATA[--threshold=<m.n> [default: 1.0] ]]></option> |
| </term> |
| <listitem> |
| <para>Same as Massif's <option>--threshold</option> option, but |
| applied after profiling rather than during.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term> |
| <option><![CDATA[--x=<4..1000> [default: 72] ]]></option> |
| </term> |
| <listitem> |
| <para>Width of the graph, in columns.</para> |
| </listitem> |
| </varlistentry> |
| |
| <varlistentry> |
| <term> |
| <option><![CDATA[--y=<4..1000> [default: 20] ]]></option> |
| </term> |
| <listitem> |
| <para>Height of the graph, in rows.</para> |
| </listitem> |
| </varlistentry> |
| |
| </variablelist> |
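<para>The effect of <option>--threshold</option> can be pictured with the
following C sketch, which adds up the sizes of all entries that fall below a
given percentage of the total.  It is an illustration only, not ms_print's
actual code: the function
<function>aggregate_below_threshold</function> and its parameters are
hypothetical.</para>

<programlisting><![CDATA[
#include <stddef.h>

/* Hypothetical helper: return the combined size of all entries that
   account for less than `threshold_pct` percent of `total` bytes. */
size_t aggregate_below_threshold(const size_t *sizes, size_t n,
                                 size_t total, double threshold_pct)
{
    size_t aggregated = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        /* Share of the total taken by this entry, as a percentage. */
        double pct = total ? 100.0 * (double)sizes[i] / (double)total : 0.0;

        if (pct < threshold_pct)
            aggregated += sizes[i];
    }
    return aggregated;
}
]]></programlisting>

<para>For entries of 9000, 950, 30 and 20 bytes out of a total of 10000, a
threshold of 1.0 aggregates the last two (50 bytes), while a threshold of
10.0 aggregates the last three (1000 bytes).</para>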
| |
| </sect1> |
| |
| <sect1 id="ms-manual.fileformat" xreflabel="fileformat"> |
| <title>Massif's Output File Format</title> |
| <para>Massif's file format is plain text (i.e. not binary) and deliberately |
| easy to read for both humans and machines. Nonetheless, the exact format |
| is not described here. This is because the format is currently very |
| Massif-specific. In the future we hope to make the format more general, and |
| thus suitable for possible use with other tools. Once this has been done, |
| the format will be documented here.</para> |
| |
| </sect1> |
| |
| </chapter> |