As of May 2005, Valgrind can produce its output in XML form. The
intention is to provide an easily parsed, stable format which is
suitable for GUIs to read.


Design goals
~~~~~~~~~~~~

* Produce XML output which is easily parsed.

* Have a stable output format which does not change much over time, so
  that investments in parser-writing by GUI developers are not lost as
  new versions of Valgrind appear.

* Have an extensible output format, so that future changes to the
  format do not break backwards compatibility with existing parsers of
  it.

* Produce output in a form which is suitable for both offline GUIs (run
  all the way to the end, then examine output) and interactive GUIs
  (parse XML incrementally, update display as we go).

* Put as much information as possible into the XML and let the GUIs
  decide what to show the user (a.k.a. provide mechanism, not policy).

* Make XML which is actually parseable by standard XML tools.


How to use
~~~~~~~~~~

Run with the flag --xml=yes. That's all. Note, however, several
caveats.

* At the present time only Memcheck is supported. The scheme extends
  easily enough to cover Addrcheck and Helgrind if needed.

* When XML output is selected, various other settings are made.
  This is in order that the output format is more controlled.
  The settings which are changed are:

  - Suppression generation is disabled, as that would require user
    input.

  - Attaching to GDB is disabled for the same reason.

  - The verbosity level is set to 1 (-v).

  - Error limits are disabled. Usually if the program generates a lot
    of errors, Valgrind slows down and eventually stops collecting
    them. When outputting XML this is not the case.

  - VEX emulation warnings are not shown.

  - File descriptor leak checking is disabled. This could be
    re-enabled at some future point.

  - Maximum-detail leak checking is selected (--leak-check=full).


The output format
~~~~~~~~~~~~~~~~~
For the most part this should be self-descriptive. It is printed in a
sort-of human-readable way for easy understanding. You may want to
read the rest of this together with the results of "valgrind --xml=yes
memcheck/tests/xml1" as an example.

All tags are balanced: a <foo> tag is always closed by </foo>. Hence
in the description that follows, mention of a tag <foo> implicitly
means there is a matching closing tag </foo>.

Symbols in CAPITALS are nonterminals in the grammar and are defined
somewhere below. The root nonterminal is TOPLEVEL.

The following nonterminals are not described further:
   INT   is a 64-bit signed decimal integer.
   TEXT  is arbitrary text.
   HEX64 is a 64-bit hexadecimal number, with leading "0x".
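
As a minimal sketch of how a reader of this format might convert the
INT and HEX64 nonterminals, assuming Python purely for illustration
(the helper names parse_int and parse_hex64 are hypothetical and are
not part of Valgrind):

```python
def parse_int(text):
    """Convert an INT field (signed decimal) to a Python int."""
    return int(text.strip(), 10)


def parse_hex64(text):
    """Convert a HEX64 field (hex digits with a leading "0x") to an int."""
    text = text.strip()
    if not text.lower().startswith("0x"):
        raise ValueError("HEX64 must start with 0x: %r" % text)
    value = int(text, 16)  # base 16 accepts the "0x" prefix
    if value >= 2 ** 64:
        raise ValueError("HEX64 does not fit in 64 bits: %r" % text)
    return value
```

TEXT needs no conversion; it is used as the string the XML parser
returns.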

Text strings are escaped so as to remove the <, > and & characters
which would otherwise mess up parsing. They are replaced with the
standard encodings "&lt;", "&gt;" and "&amp;" respectively.
Note this is not (yet) done throughout, only for function names in
<frame>..</frame> tag-pairs.
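
The escaping rule can be illustrated with Python's standard library.
This is only a sketch of the encoding itself, not of Valgrind's
implementation, and the C++ function name is made up for the example:

```python
from xml.sax.saxutils import escape, unescape

# A hypothetical function name containing all three special characters.
raw_name = "operator<<(std::ostream&, int)"

# What the text of an <fn> field would look like after escaping.
escaped = escape(raw_name)

# What a GUI would want to display again.
restored = unescape(escaped)
```

Here escaped is "operator&lt;&lt;(std::ostream&amp;, int)" and restored
equals raw_name. A standard XML parser performs the unescaping itself,
so a GUI that uses one normally never needs to call unescape directly.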


TOPLEVEL
--------

The first line output is always this:

   <?xml version="1.0"?>

All remaining output is contained within the tag-pair
<valgrindoutput>.

Inside that, the first entity is an indication of the protocol
version. This is provided so that existing parsers can identify XML
created by future versions of Valgrind merely by observing that the
protocol version is one they don't understand. Hence TOPLEVEL is:

  <?xml version="1.0"?>
  <valgrindoutput>
    <protocolversion>INT</protocolversion>
    VERSION1STUFF
  </valgrindoutput>

The only currently defined protocol version number is 1. This
document only defines protocol version 1.
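
A parser might guard against unknown protocol versions along the
following lines. This is a sketch assuming Python's
xml.etree.ElementTree and a complete document held in memory; the
function name and the SAMPLE text are illustrative only:

```python
import xml.etree.ElementTree as ET

# Versions this hypothetical parser knows how to read.
SUPPORTED_PROTOCOL_VERSIONS = {1}

SAMPLE = """<?xml version="1.0"?>
<valgrindoutput>
  <protocolversion>1</protocolversion>
</valgrindoutput>
"""


def read_protocol_version(xml_text):
    """Return the protocol version, or raise ValueError if unusable."""
    root = ET.fromstring(xml_text)
    if root.tag != "valgrindoutput":
        raise ValueError("unexpected root element: %s" % root.tag)
    text = root.findtext("protocolversion")
    if text is None:
        raise ValueError("missing <protocolversion>")
    version = int(text)
    if version not in SUPPORTED_PROTOCOL_VERSIONS:
        raise ValueError("unsupported protocol version: %d" % version)
    return version


version = read_protocol_version(SAMPLE)
```

Rejecting the document before looking at anything else keeps a parser
written for version 1 from misreading output it does not understand.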


VERSION1STUFF
-------------
This is the main top-level construction. Roughly speaking, it
contains a load of preamble, the errors from the run of the
program, and the result of the final leak check. Hence the
following in sequence:

* Various preamble lines which give version info for the various
  components. The text in them can be anything; it is not intended
  for interpretation by the GUI:

  <preamble>
     <line>Misc version/copyright text</line>  (zero or more of)
  </preamble>

* The PID of this process and of its parent:

  <pid>INT</pid>
  <ppid>INT</ppid>

* The name of the tool being used:

  <tool>TEXT</tool>

* OPTIONALLY, if --log-file-qualifier=VAR flag was given:

  <logfilequalifier> <var>VAR</var> <value>$VAR</value>
  </logfilequalifier>

  That is, both the name of the environment variable and its value
  are given.

* OPTIONALLY, if --xml-user-comment=STRING was given:

  <usercomment>STRING</usercomment>

  STRING is not escaped in any way, so that it itself may be a piece
  of XML with arbitrary tags etc.

* The program and args: first those pertaining to Valgrind itself, and
  then those pertaining to the program to be run under Valgrind (the
  client):

  <args>
    <vargv>
      <exe>TEXT</exe>
      <arg>TEXT</arg> (zero or more of)
    </vargv>
    <argv>
      <exe>TEXT</exe>
      <arg>TEXT</arg> (zero or more of)
    </argv>
  </args>

* The following, indicating that the program has now started:

  <status> <state>RUNNING</state>
           <time>human-readable-time-string</time>
  </status>

* Zero or more of (either ERROR or ERRORCOUNTS).

* The following, indicating that the program has now finished, and
  that the wrapup (leak checking) is happening.

  <status> <state>FINISHED</state>
           <time>human-readable-time-string</time>
  </status>

* SUPPCOUNTS, indicating how many times each suppression was used.

* Zero or more ERRORs, each of which is a complaint from the
  leak checker.

That's it.
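
The fixed header fields of the sequence above could be collected as
follows. This is a sketch assuming Python's xml.etree.ElementTree; the
function names are hypothetical and every value in SAMPLE (PIDs, paths,
time strings) is invented for illustration:

```python
import xml.etree.ElementTree as ET

SAMPLE = """<?xml version="1.0"?>
<valgrindoutput>
  <protocolversion>1</protocolversion>
  <preamble><line>illustrative preamble text</line></preamble>
  <pid>1234</pid>
  <ppid>1200</ppid>
  <tool>memcheck</tool>
  <args>
    <vargv><exe>valgrind</exe><arg>--xml=yes</arg></vargv>
    <argv><exe>./client</exe><arg>input.txt</arg></argv>
  </args>
  <status> <state>RUNNING</state> <time>illustrative time</time> </status>
  <status> <state>FINISHED</state> <time>illustrative time</time> </status>
  <suppcounts></suppcounts>
</valgrindoutput>
"""


def argv_of(root, tag):
    """Return [exe, arg, ...] for <vargv> or <argv> inside <args>."""
    node = root.find("args/" + tag)
    if node is None:
        return []
    return [node.findtext("exe")] + [a.text for a in node.findall("arg")]


def read_header(root):
    """Collect the fields that precede and bracket the error list."""
    return {
        "pid": int(root.findtext("pid")),
        "ppid": int(root.findtext("ppid")),
        "tool": root.findtext("tool"),
        "valgrind_argv": argv_of(root, "vargv"),
        "client_argv": argv_of(root, "argv"),
        "states": [s.findtext("state") for s in root.findall("status")],
    }


header = read_header(ET.fromstring(SAMPLE))
```

The preamble is deliberately ignored, since its text is not intended
for interpretation by the GUI.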


ERROR
-----
This shows an error, and is the most complex nonterminal. The format
is as follows:

  <error>
     <unique>HEX64</unique>
     <tid>INT</tid>
     <kind>KIND</kind>
     <what>TEXT</what>

     optionally: <leakedbytes>INT</leakedbytes>
     optionally: <leakedblocks>INT</leakedblocks>

     STACK

     optionally: <auxwhat>TEXT</auxwhat>
     optionally: STACK

  </error>

* Each error contains a unique, arbitrary 64-bit hex number. This is
  used to refer to the error in ERRORCOUNTS nonterminals (see below).

* The <tid> tag indicates the Valgrind thread number. This value
  is arbitrary but may be used to determine which threads produced
  which errors (at least, the first instance of each error).

* The <kind> tag specifies one of a small number of fixed error
  types (enumerated below), so that GUIs may roughly categorise
  errors by type if they want.

* The <what> tag gives a human-understandable description of the
  error.

* For <kind> tags specifying a KIND of the form "Leak_*", the
  optional <leakedbytes> and <leakedblocks> indicate the number of
  bytes and blocks leaked by this error.

* The primary STACK for this error, indicating where it occurred.

* Some error types may have auxiliary information attached:

     <auxwhat>TEXT</auxwhat> gives an auxiliary human-readable
     description (usually of invalid addresses)

     STACK gives an auxiliary stack (usually the allocation/free
     point of a block). If this STACK is present then
     <auxwhat>TEXT</auxwhat> will precede it.
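
One possible in-memory shape for an ERROR is sketched below, assuming
Python's xml.etree.ElementTree. The ErrorRecord class and parse_error
function are hypothetical, and the addresses and messages in
SAMPLE_ERROR are invented:

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import Optional

SAMPLE_ERROR = """
<error>
  <unique>0x2</unique>
  <tid>1</tid>
  <kind>InvalidWrite</kind>
  <what>illustrative description of the error</what>
  <stack><frame><ip>0x8048400</ip></frame></stack>
  <auxwhat>illustrative description of the address</auxwhat>
  <stack><frame><ip>0x8048300</ip></frame></stack>
</error>
"""


@dataclass
class ErrorRecord:
    unique: int                   # key used by ERRORCOUNTS
    tid: int
    kind: str
    what: str
    leaked_bytes: Optional[int]   # only expected for Leak_* kinds
    leaked_blocks: Optional[int]
    stack_count: int              # 1 (primary) or 2 (primary + auxiliary)
    auxwhat: Optional[str]


def parse_error(elem):
    """Build an ErrorRecord from an <error> element."""
    def opt_int(tag):
        text = elem.findtext(tag)
        return int(text) if text is not None else None

    return ErrorRecord(
        unique=int(elem.findtext("unique"), 16),
        tid=int(elem.findtext("tid")),
        kind=elem.findtext("kind"),
        what=elem.findtext("what"),
        leaked_bytes=opt_int("leakedbytes"),
        leaked_blocks=opt_int("leakedblocks"),
        stack_count=len(elem.findall("stack")),
        auxwhat=elem.findtext("auxwhat"),
    )


record = parse_error(ET.fromstring(SAMPLE_ERROR))
```

Optional fields map to None when absent, so the same record type
serves both ordinary errors and leak reports.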


KIND
----
This is a small enumeration indicating roughly the nature of an error.
The possible values are:

   InvalidFree

      free/delete/delete[] on an invalid pointer

   MismatchedFree

      free/delete/delete[] does not match allocation function
      (e.g. doing new[] then free on the result)

   InvalidRead

      read of an invalid address

   InvalidWrite

      write of an invalid address

   InvalidJump

      jump to an invalid address

   Overlap

      args overlap or are otherwise bogus in e.g. memcpy

   InvalidMemPool

      invalid mem pool specified in client request

   UninitCondition

      conditional jump/move depends on undefined value

   UninitValue

      other use of undefined value (primarily memory addresses)

   SyscallParam

      system call params are undefined or point to
      undefined/unaddressable memory

   ClientCheck

      "error" resulting from a client check request

   Leak_DefinitelyLost

      memory leak; the referenced blocks are definitely lost

   Leak_IndirectlyLost

      memory leak; the referenced blocks are lost because all pointers
      to them are also in leaked blocks

   Leak_PossiblyLost

      memory leak; only interior pointers to referenced blocks were
      found

   Leak_StillReachable

      memory leak; pointers to un-freed blocks are still available
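
A GUI that wants to group errors by KIND could start from something
like this Python sketch. The three category names ("leak", "error",
"unknown") are an assumption made for the example and are not part of
the format:

```python
# The KIND values enumerated above.
KNOWN_KINDS = frozenset([
    "InvalidFree", "MismatchedFree", "InvalidRead", "InvalidWrite",
    "InvalidJump", "Overlap", "InvalidMemPool", "UninitCondition",
    "UninitValue", "SyscallParam", "ClientCheck",
    "Leak_DefinitelyLost", "Leak_IndirectlyLost", "Leak_PossiblyLost",
    "Leak_StillReachable",
])


def categorise(kind):
    """Map a KIND string to a coarse, GUI-defined category."""
    if kind not in KNOWN_KINDS:
        return "unknown"   # e.g. a value this parser has never seen
    if kind.startswith("Leak_"):
        return "leak"      # these may carry <leakedbytes>/<leakedblocks>
    return "error"
```

Returning "unknown" rather than failing lets such a GUI keep working
if it meets a value it does not recognise.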


STACK
-----
STACK indicates locations in the program being debugged. A STACK
is one or more FRAMEs. The first is the innermost frame, the
next its caller, etc.

   <stack>
      one or more FRAME
   </stack>


FRAME
-----
FRAME records a single program location:

   <frame>
      <ip>HEX64</ip>
      optionally <obj>TEXT</obj>
      optionally <fn>TEXT</fn>
      optionally <dir>TEXT</dir>
      optionally <file>TEXT</file>
      optionally <line>INT</line>
   </frame>

Only the <ip> field is guaranteed to be present. It indicates a
code ("instruction pointer") address.

The optional fields, if present, appear in the order stated:

* obj: gives the name of the ELF object containing the code address

* fn: gives the name of the function containing the code address

* dir: gives the source directory associated with the name specified
  by <file>. Note the current implementation often does not
  put anything useful in this field.

* file: gives the name of the source file containing the code address

* line: gives the line number in the source file
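
Because only <ip> is guaranteed, code that displays a FRAME has to
cope with any of the other fields being absent. The Python sketch
below shows one way to do that; the display format and function names
are the example's own choice and the sample addresses and paths are
invented:

```python
import xml.etree.ElementTree as ET

SAMPLE_STACK = """
<stack>
  <frame>
    <ip>0x8048400</ip>
    <obj>/tmp/a.out</obj>
    <fn>f</fn>
    <file>a.c</file>
    <line>7</line>
  </frame>
  <frame><ip>0x8048500</ip></frame>
</stack>
"""


def describe_frame(frame):
    """Render one <frame>, using whichever optional fields exist."""
    text = frame.findtext("ip")          # always present
    fn = frame.findtext("fn")
    obj = frame.findtext("obj")
    src = frame.findtext("file")
    line = frame.findtext("line")
    if fn is not None:
        text += ": " + fn
    if src is not None and line is not None:
        text += " (%s:%s)" % (src, line)
    elif src is not None:
        text += " (%s)" % src
    elif obj is not None:
        text += " (in %s)" % obj
    return text


def describe_stack(stack):
    """Render a <stack>, innermost frame first."""
    return [describe_frame(f) for f in stack.findall("frame")]


lines = describe_stack(ET.fromstring(SAMPLE_STACK))
```

For the sample this gives "0x8048400: f (a.c:7)" for the innermost
frame and a bare "0x8048500" for its caller.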


ERRORCOUNTS
-----------
This specifies, for each error that has been so far presented,
the number of occurrences of that error.

  <errorcounts>
     zero or more of
        <pair> <count>INT</count> <unique>HEX64</unique> </pair>
  </errorcounts>

Each <pair> gives the current error count <count> for the error with
unique tag <unique>. An <errorcounts> block need not give a count for
every error presented so far; partial information is allowable.

As of Valgrind rev 3793, error counts are only emitted at program
termination. However, it is perfectly acceptable to periodically emit
error counts as the program is running. Doing so would allow a GUI to
dynamically update its error-count display as the program runs.
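
An interactive GUI could track the counts incrementally, without
waiting for the end of the document. The sketch below assumes
Python's xml.etree.ElementTree.XMLPullParser; the function names are
hypothetical and DOC is a cut-down, invented document that contains
only ERRORCOUNTS blocks:

```python
import xml.etree.ElementTree as ET

DOC = (
    '<?xml version="1.0"?>'
    '<valgrindoutput>'
    '<errorcounts>'
    '<pair><count>2</count><unique>0x1</unique></pair>'
    '</errorcounts>'
    '<errorcounts>'
    '<pair><count>5</count><unique>0x1</unique></pair>'
    '<pair><count>1</count><unique>0x2</unique></pair>'
    '</errorcounts>'
    '</valgrindoutput>'
)


def update_counts(parser, counts):
    """Apply every completed <errorcounts> block seen so far."""
    for _event, elem in parser.read_events():
        if elem.tag == "errorcounts":
            for pair in elem.findall("pair"):
                unique = int(pair.findtext("unique"), 16)
                # A later block overrides an earlier count; errors it
                # does not mention keep their previous value.
                counts[unique] = int(pair.findtext("count"))


def feed_chunks(chunks):
    """Feed XML text piece by piece, as if it arrived from a pipe."""
    parser = ET.XMLPullParser(events=("end",))
    counts = {}
    for chunk in chunks:
        parser.feed(chunk)
        update_counts(parser, counts)   # a GUI would redraw here
    parser.close()
    update_counts(parser, counts)       # pick up anything still queued
    return counts


# Split the document into 16-character pieces to simulate streaming.
chunks = [DOC[i:i + 16] for i in range(0, len(DOC), 16)]
counts = feed_chunks(chunks)
```

Keeping the counts in a dictionary keyed by the <unique> value is what
makes partial blocks harmless: each block only touches the entries it
names.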


SUPPCOUNTS
----------
A SUPPCOUNTS block appears exactly once, after the program terminates.
It specifies the number of times each error-suppression was used.
Suppressions not mentioned were used zero times.

  <suppcounts>
     zero or more of
        <pair> <count>INT</count> <name>TEXT</name> </pair>
  </suppcounts>

The <name> is as specified in the suppression name fields in .supp
files.
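
Reading a SUPPCOUNTS block, including the rule that unmentioned
suppressions were used zero times, might look like the following
Python sketch. The function names and the suppression names in SAMPLE
are invented for illustration:

```python
import xml.etree.ElementTree as ET

SAMPLE = """<?xml version="1.0"?>
<valgrindoutput>
  <suppcounts>
    <pair> <count>3</count> <name>example-suppression-a</name> </pair>
    <pair> <count>1</count> <name>example-suppression-b</name> </pair>
  </suppcounts>
</valgrindoutput>
"""


def read_suppcounts(root):
    """Return {suppression name: times used} from <suppcounts>."""
    counts = {}
    block = root.find("suppcounts")
    if block is not None:
        for pair in block.findall("pair"):
            counts[pair.findtext("name")] = int(pair.findtext("count"))
    return counts


def times_used(counts, name):
    """Suppressions not mentioned in the block were used zero times."""
    return counts.get(name, 0)


supp = read_suppcounts(ET.fromstring(SAMPLE))
```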