As of May 2005, Valgrind can produce its output in XML form. The
intention is to provide an easily parsed, stable format which is
suitable for GUIs to read.


Design goals
~~~~~~~~~~~~

* Produce XML output which is easily parsed.

* Have a stable output format which does not change much over time, so
  that investments in parser-writing by GUI developers are not lost as
  new versions of Valgrind appear.

* Have an extensible output format, so that future changes to the
  format do not break backwards compatibility with existing parsers of
  it.

* Produce output in a form which is suitable for both offline GUIs (run
  all the way to the end, then examine output) and interactive GUIs
  (parse XML incrementally, update display as we go).

* Put as much information as possible into the XML and let the GUIs
  decide what to show the user (a.k.a. provide mechanism, not policy).

* Make XML which is actually parseable by standard XML tools.


How to use
~~~~~~~~~~

Run with flag --xml=yes. That's all. Note however several
caveats.

* At the present time only Memcheck is supported. The scheme extends
  easily enough to cover Helgrind if needed.

* When XML output is selected, various other settings are made.
  This is in order that the output format is more controlled.
  The settings which are changed are:

  - Suppression generation is disabled, as that would require user
    input.

  - Attaching to GDB is disabled for the same reason.

  - The verbosity level is set to 1 (-v).

  - Error limits are disabled. Usually if the program generates a lot
    of errors, Valgrind slows down and eventually stops collecting
    them. When outputting XML this is not the case.

  - VEX emulation warnings are not shown.

  - File descriptor leak checking is disabled. This could be
    re-enabled at some future point.

  - Maximum-detail leak checking is selected (--leak-check=full).

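As a minimal sketch of driving this from a script, the Python below
builds the command line and captures the output. It assumes that
Valgrind is installed on the PATH as "valgrind" and that, with no
log-file option given, the XML arrives on stderr. The function names
are made up for this example and are not part of Valgrind.

```python
import subprocess
from typing import List, Sequence


def valgrind_xml_command(program: str, args: Sequence[str] = ()) -> List[str]:
    """Build an argv list that runs `program` under Valgrind with XML output."""
    # --xml=yes is the only flag needed; no tool is selected here, so
    # Valgrind falls back to its default tool, Memcheck.
    return ["valgrind", "--xml=yes", program, *args]


def run_and_capture(program: str, args: Sequence[str] = ()) -> str:
    """Run the client under Valgrind and return what was written to stderr.

    Assumption: no log-file option is in effect, so the XML goes to stderr.
    """
    result = subprocess.run(
        valgrind_xml_command(program, args),
        stderr=subprocess.PIPE,
        text=True,
        check=False,  # a failing client is not an error for this sketch
    )
    return result.stderr
```

Because the settings listed above are forced by --xml=yes, the sketch
does not pass any of them explicitly.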

The output format
~~~~~~~~~~~~~~~~~
For the most part this should be self descriptive. It is printed in a
sort-of human-readable way for easy understanding. You may want to
read the rest of this together with the results of "valgrind --xml=yes
memcheck/tests/xml1" as an example.

All tags are balanced: a <foo> tag is always closed by </foo>. Hence
in the description that follows, mention of a tag <foo> implicitly
means there is a matching closing tag </foo>.

Symbols in CAPITALS are nonterminals in the grammar and are defined
somewhere below. The root nonterminal is TOPLEVEL.

The following nonterminals are not described further:
   INT   is a 64-bit signed decimal integer.
   TEXT  is arbitrary text.
   HEX64 is a 64-bit hexadecimal number, with leading "0x".

Text strings are escaped so as to remove the <, > and & characters
which would otherwise mess up parsing. They are replaced with the
standard encodings "&lt;", "&gt;" and "&amp;" respectively.
Note this is not (yet) done throughout, only for function names in
<frame>..</frame> tag-pairs.

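The escaping rule can be sketched in a few lines of Python. This is
an illustration of the three substitutions named above, not
Valgrind's own code; the function names are hypothetical.

```python
def escape_text(text: str) -> str:
    """Replace &, < and > with their standard XML encodings."""
    # '&' must be handled first; otherwise the '&' introduced by
    # "&lt;" and "&gt;" would itself be escaped a second time.
    return text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")


def unescape_text(text: str) -> str:
    """Undo escape_text()."""
    # "&amp;" is decoded last, mirroring the order used when escaping.
    return text.replace("&lt;", "<").replace("&gt;", ">").replace("&amp;", "&")
```

A reader that uses a standard XML parser gets the decoding step for
free; the explicit version is only needed when handling raw text.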

TOPLEVEL
--------

The first line output is always this:

   <?xml version="1.0"?>

All remaining output is contained within the tag-pair
<valgrindoutput>.

Inside that, the first entity is an indication of the protocol
version. This is provided so that existing parsers can identify XML
created by future versions of Valgrind merely by observing that the
protocol version is one they don't understand. Hence TOPLEVEL is:

  <?xml version="1.0"?>
  <valgrindoutput>
    <protocolversion>INT</protocolversion>
    PROTOCOL
  </valgrindoutput>

Valgrind versions 3.0.0 and 3.0.1 emit protocol version 1. Versions
3.1.X and 3.2.X emit protocol version 2. 3.4.X emits protocol version
3.

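A parser might apply the version check along the following lines.
This is a sketch in Python using the standard xml.etree.ElementTree
module; the set of supported versions and the function names are
assumptions of the example, and a real GUI would list only the
versions it has actually been written against.

```python
import xml.etree.ElementTree as ET

# Hypothetical: the protocol versions this particular parser understands.
SUPPORTED_PROTOCOL_VERSIONS = {1, 2, 3}


def read_protocol_version(xml_text: str) -> int:
    """Return the INT inside <protocolversion> of a complete document."""
    root = ET.fromstring(xml_text)
    if root.tag != "valgrindoutput":
        raise ValueError("root element is not <valgrindoutput>")
    text = root.findtext("protocolversion")
    if text is None:
        raise ValueError("missing <protocolversion>")
    return int(text)


def can_parse(xml_text: str) -> bool:
    """True if the document declares a protocol version we understand."""
    return read_protocol_version(xml_text) in SUPPORTED_PROTOCOL_VERSIONS
```

Checking the version before touching anything else lets an older
parser refuse newer output cleanly instead of misreading it.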

PROTOCOL for version 3
----------------------
Changes in 3.4.X (tentative): (jrs, 1 March 2008)

* There may be more than one <logfilequalifier> clause.

* Some errors may have two <auxwhat> blocks, rather than just one
  (resulting from merge of the DATASYMS branch)

* Some errors may have an ORIGIN component, indicating the origins of
  uninitialised values. This results from the merge of the
  OTRACK_BY_INSTRUMENTATION branch.


PROTOCOL for version 2
----------------------
Version 2 is identical in every way to version 1, except that the time
string in

  <time>human-readable-time-string</time>

has changed format, and now gives elapsed wallclock time since process
start rather than local time or any such. In fact version 1 does not
define the format of the string so in some ways this revision is
irrelevant.


PROTOCOL for version 1
----------------------
This is the main top-level construction. Roughly speaking, it
contains a load of preamble, the errors from the run of the
program, and the result of the final leak check. Hence the
following in sequence:

* Various preamble lines which give version info for the various
  components. The text in them can be anything; it is not intended
  for interpretation by the GUI:

  <preamble>
    <line>Misc version/copyright text</line>  (zero or more of)
  </preamble>

* The PID of this process and of its parent:

  <pid>INT</pid>
  <ppid>INT</ppid>

* The name of the tool being used:

  <tool>TEXT</tool>

* OPTIONALLY, if the --log-file-qualifier=VAR flag was given:

  <logfilequalifier> <var>VAR</var> <value>$VAR</value>
  </logfilequalifier>

  That is, both the name of the environment variable and its value
  are given.
  [update: as of v3.3.0, this is not present, as the --log-file-qualifier
  option has been removed, replaced by the %q format specifier in --log-file.]

* OPTIONALLY, if --xml-user-comment=STRING was given:

  <usercomment>STRING</usercomment>

  STRING is not escaped in any way, so that it itself may be a piece
  of XML with arbitrary tags etc.

* The program and args: first those pertaining to Valgrind itself, and
  then those pertaining to the program to be run under Valgrind (the
  client):

  <args>
    <vargv>
      <exe>TEXT</exe>
      <arg>TEXT</arg> (zero or more of)
    </vargv>
    <argv>
      <exe>TEXT</exe>
      <arg>TEXT</arg> (zero or more of)
    </argv>
  </args>

* The following, indicating that the program has now started:

  <status> <state>RUNNING</state>
           <time>human-readable-time-string</time>
  </status>

* Zero or more of (either ERROR or ERRORCOUNTS).

* The following, indicating that the program has now finished, and
  that the wrapup (leak checking) is happening.

  <status> <state>FINISHED</state>
           <time>human-readable-time-string</time>
  </status>

* SUPPCOUNTS, indicating how many times each suppression was used.

* Zero or more ERRORs, each of which is a complaint from the
  leak checker.

That's it.

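The fixed part of the sequence above (preamble, PIDs, tool name and
argument vectors) can be read with a few ElementTree lookups. The
sketch below is illustrative only: the RunHeader type and function
names are invented for the example, and it assumes the whole document
has already been parsed into a root <valgrindoutput> element.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class RunHeader:
    pid: int
    ppid: int
    tool: str
    preamble: List[str]
    valgrind_argv: List[str]  # from <vargv>
    client_argv: List[str]    # from <argv>


def _argv(node: Optional[ET.Element]) -> List[str]:
    """Flatten <exe> plus zero or more <arg> into one list."""
    if node is None:
        return []
    return [node.findtext("exe", default="")] + [
        arg.text or "" for arg in node.findall("arg")
    ]


def parse_header(root: ET.Element) -> RunHeader:
    """Collect the header fields; raises TypeError if <pid>/<ppid> is absent."""
    return RunHeader(
        pid=int(root.findtext("pid")),
        ppid=int(root.findtext("ppid")),
        tool=root.findtext("tool", default="").strip(),
        preamble=[line.text or "" for line in root.findall("preamble/line")],
        valgrind_argv=_argv(root.find("args/vargv")),
        client_argv=_argv(root.find("args/argv")),
    )
```

Note that an unescaped <usercomment> may contain arbitrary tags, so a
reader should not assume anything about the children of that element.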

ERROR
-----
This shows an error, and is the most complex nonterminal. The format
is as follows:

  <error>
     <unique>HEX64</unique>
     <tid>INT</tid>
     <kind>KIND</kind>
     <what>TEXT</what>

     optionally: <leakedbytes>INT</leakedbytes>
     optionally: <leakedblocks>INT</leakedblocks>

     STACK

     optionally: <auxwhat>TEXT</auxwhat>
     optionally: STACK
     optionally: ORIGIN

  </error>

* Each error contains a unique, arbitrary 64-bit hex number. This is
  used to refer to the error in ERRORCOUNTS nonterminals (see below).

* The <tid> tag indicates the Valgrind thread number. This value
  is arbitrary but may be used to determine which threads produced
  which errors (at least, the first instance of each error).

* The <kind> tag specifies one of a small number of fixed error
  types (enumerated below), so that GUIs may roughly categorise
  errors by type if they want.

* The <what> tag gives a human-understandable description of the
  error.

* For <kind> tags specifying a KIND of the form "Leak_*", the
  optional <leakedbytes> and <leakedblocks> indicate the number of
  bytes and blocks leaked by this error.

* The primary STACK for this error, indicating where it occurred.

* Some error types may have auxiliary information attached:

     <auxwhat>TEXT</auxwhat> gives an auxiliary human-readable
     description (usually of invalid addresses)

     STACK gives an auxiliary stack (usually the allocation/free
     point of a block). If this STACK is present then
     <auxwhat>TEXT</auxwhat> will precede it.

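One possible in-memory shape for an ERROR is sketched below in
Python. The ErrorRecord type and its field names are assumptions of
the example. Stacks are kept as raw elements here, with the primary
STACK first and any auxiliary STACK after it.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ErrorRecord:
    unique: int                   # HEX64, decoded
    tid: int
    kind: str
    what: str
    leaked_bytes: Optional[int]   # only present for Leak_* kinds
    leaked_blocks: Optional[int]
    auxwhats: List[str]           # zero, one or (protocol 3) two entries
    stacks: List[ET.Element]      # primary STACK first, then auxiliary


def _optional_int(node: ET.Element, tag: str) -> Optional[int]:
    text = node.findtext(tag)
    return int(text) if text is not None else None


def parse_error(node: ET.Element) -> ErrorRecord:
    """Convert one <error> element into an ErrorRecord."""
    return ErrorRecord(
        unique=int(node.findtext("unique"), 16),  # accepts the "0x" prefix
        tid=int(node.findtext("tid")),
        kind=node.findtext("kind", default="").strip(),
        what=node.findtext("what", default=""),
        leaked_bytes=_optional_int(node, "leakedbytes"),
        leaked_blocks=_optional_int(node, "leakedblocks"),
        auxwhats=[aux.text or "" for aux in node.findall("auxwhat")],
        # findall() matches direct children only, so a <stack> nested
        # inside <origin> is not picked up here.
        stacks=node.findall("stack"),
    )
```

Keeping the unique value as an integer makes it usable directly as a
key when ERRORCOUNTS blocks arrive later.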

KIND
----
This is a small enumeration indicating roughly the nature of an error.
The possible values are:

   InvalidFree

      free/delete/delete[] on an invalid pointer

   MismatchedFree

      free/delete/delete[] does not match allocation function
      (eg doing new[] then free on the result)

   InvalidRead

      read of an invalid address

   InvalidWrite

      write of an invalid address

   InvalidJump

      jump to an invalid address

   Overlap

      args overlap or are otherwise bogus in eg memcpy

   InvalidMemPool

      invalid mem pool specified in client request

   UninitCondition

      conditional jump/move depends on undefined value

   UninitValue

      other use of undefined value (primarily memory addresses)

   SyscallParam

      system call params are undefined or point to
      undefined/unaddressable memory

   ClientCheck

      "error" resulting from a client check request

   Leak_DefinitelyLost

      memory leak; the referenced blocks are definitely lost

   Leak_IndirectlyLost

      memory leak; the referenced blocks are lost because all pointers
      to them are also in leaked blocks

   Leak_PossiblyLost

      memory leak; only interior pointers to referenced blocks were
      found

   Leak_StillReachable

      memory leak; pointers to un-freed blocks are still available

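For a GUI that wants to categorise errors, the enumeration above maps
naturally onto an enum type. The following Python sketch mirrors the
values listed; the member names and helper functions are choices made
for this example only.

```python
from enum import Enum


class Kind(Enum):
    """The KIND values listed above, keyed by their XML spelling."""
    INVALID_FREE = "InvalidFree"
    MISMATCHED_FREE = "MismatchedFree"
    INVALID_READ = "InvalidRead"
    INVALID_WRITE = "InvalidWrite"
    INVALID_JUMP = "InvalidJump"
    OVERLAP = "Overlap"
    INVALID_MEM_POOL = "InvalidMemPool"
    UNINIT_CONDITION = "UninitCondition"
    UNINIT_VALUE = "UninitValue"
    SYSCALL_PARAM = "SyscallParam"
    CLIENT_CHECK = "ClientCheck"
    LEAK_DEFINITELY_LOST = "Leak_DefinitelyLost"
    LEAK_INDIRECTLY_LOST = "Leak_IndirectlyLost"
    LEAK_POSSIBLY_LOST = "Leak_PossiblyLost"
    LEAK_STILL_REACHABLE = "Leak_StillReachable"


def parse_kind(text: str) -> Kind:
    """Look up a KIND by the text of a <kind> tag; ValueError if unknown."""
    return Kind(text.strip())


def is_leak(kind: Kind) -> bool:
    """True for the Leak_* kinds, which carry <leakedbytes>/<leakedblocks>."""
    return kind.value.startswith("Leak_")
```

A parser meant to survive future protocol versions may prefer to keep
unknown kinds as plain strings rather than raising.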

STACK
-----
STACK indicates locations in the program being debugged. A STACK
is one or more FRAMEs. The first is the innermost frame, the
next its caller, etc.

   <stack>
      one or more FRAME
   </stack>


FRAME
-----
FRAME records a single program location:

   <frame>
      <ip>HEX64</ip>
      optionally <obj>TEXT</obj>
      optionally <fn>TEXT</fn>
      optionally <dir>TEXT</dir>
      optionally <file>TEXT</file>
      optionally <line>INT</line>
   </frame>

Only the <ip> field is guaranteed to be present. It indicates a
code ("instruction pointer") address.

The optional fields, if present, appear in the order stated:

* obj: gives the name of the ELF object containing the code address

* fn: gives the name of the function containing the code address

* dir: gives the source directory associated with the name specified
  by <file>. Note the current implementation often does not
  put anything useful in this field.

* file: gives the name of the source file containing the code address

* line: gives the line number in the source file

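A reader for STACK and FRAME might look like the Python sketch below.
Only <ip> is treated as mandatory, as stated above; every other field
defaults to None. The Frame type and function names are hypothetical.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Frame:
    ip: int
    obj: Optional[str] = None
    fn: Optional[str] = None
    directory: Optional[str] = None  # the <dir> tag
    file: Optional[str] = None
    line: Optional[int] = None


def parse_frame(node: ET.Element) -> Frame:
    """Convert one <frame>; absent optional tags become None."""
    line = node.findtext("line")
    return Frame(
        ip=int(node.findtext("ip"), 16),
        obj=node.findtext("obj"),
        # The XML parser has already turned &lt; &gt; &amp; back into
        # <, > and &, so no extra unescaping is needed for <fn>.
        fn=node.findtext("fn"),
        directory=node.findtext("dir"),
        file=node.findtext("file"),
        line=int(line) if line is not None else None,
    )


def parse_stack(node: ET.Element) -> List[Frame]:
    """Return the frames of a <stack>, innermost first."""
    return [parse_frame(frame) for frame in node.findall("frame")]
```

Document order is preserved by findall(), so index 0 is always the
innermost frame.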

ORIGIN
------
ORIGIN shows the origin of uninitialised data in errors that involve
uninitialised data. STACK shows the origin of the uninitialised
value. TEXT gives a human-understandable hint as to the meaning of
the information in STACK.

   <origin>
      <what>TEXT</what>
      STACK
   </origin>

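Since ORIGIN is optional, a reader has to cope with its absence. The
Python sketch below pulls out the hint text and the code addresses of
the origin stack from an <error> element; the function name and the
choice of return type are assumptions of the example.

```python
import xml.etree.ElementTree as ET
from typing import List, Optional, Tuple


def parse_origin(error_node: ET.Element) -> Optional[Tuple[str, List[int]]]:
    """Return (hint text, [ip, ...]) for an error's ORIGIN, or None."""
    origin = error_node.find("origin")
    if origin is None:
        return None  # errors not involving uninitialised data have none
    what = origin.findtext("what", default="")
    # Every FRAME is guaranteed to have an <ip>, innermost frame first.
    ips = [int(ip.text, 16) for ip in origin.findall("stack/frame/ip")]
    return what, ips
```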

ERRORCOUNTS
-----------
This specifies, for each error that has been so far presented,
the number of occurrences of that error.

   <errorcounts>
      zero or more of
         <pair> <count>INT</count> <unique>HEX64</unique> </pair>
   </errorcounts>

Each <pair> gives the current error count <count> for the error with
unique tag <unique>. The counts do not have to give a count for each
error so far presented - partial information is allowable.

As of Valgrind rev 3793, error counts are only emitted at program
termination. However, it is perfectly acceptable to periodically emit
error counts as the program is running. Doing so would allow a GUI to
dynamically update its error-count display as the program runs.

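Because an ERRORCOUNTS block may be partial and may appear more than
once, a reader should merge each block into a running table rather
than replace the table. A minimal Python sketch, with a hypothetical
function name:

```python
import xml.etree.ElementTree as ET
from typing import Dict


def apply_errorcounts(counts: Dict[int, int], node: ET.Element) -> None:
    """Merge one <errorcounts> block into `counts` (unique -> count)."""
    for pair in node.findall("pair"):
        unique = int(pair.findtext("unique"), 16)
        # Each <count> is the current total, so it overwrites the old
        # value. Errors not mentioned in this block keep their counts.
        counts[unique] = int(pair.findtext("count"))
```

For example, feeding two successive blocks through the same
dictionary leaves untouched any error the second block omits.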

SUPPCOUNTS
----------
A SUPPCOUNTS block appears exactly once, after the program terminates.
It specifies the number of times each error-suppression was used.
Suppressions not mentioned were used zero times.

   <suppcounts>
      zero or more of
         <pair> <count>INT</count> <name>TEXT</name> </pair>
   </suppcounts>

The <name> is as specified in the suppression name fields in .supp
files.

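Reading SUPPCOUNTS follows the same pattern as ERRORCOUNTS, keyed by
name instead of by unique tag. The Python sketch below also encodes
the rule that unmentioned suppressions were used zero times; both
function names are invented for the example.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def parse_suppcounts(node: ET.Element) -> Dict[str, int]:
    """Map suppression name -> number of times it was used."""
    return {
        pair.findtext("name", default=""): int(pair.findtext("count"))
        for pair in node.findall("pair")
    }


def times_used(counts: Dict[str, int], name: str) -> int:
    """Suppressions not mentioned in the block were used zero times."""
    return counts.get(name, 0)
```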