As of May 2005, Valgrind can produce its output in XML form. The
intention is to provide an easily parsed, stable format which is
suitable for GUIs to read.


Design goals
~~~~~~~~~~~~

* Produce XML output which is easily parsed.

* Have a stable output format which does not change much over time, so
  that investments in parser-writing by GUI developers are not lost as
  new versions of Valgrind appear.

* Have an extensible output format, so that future changes to the
  format do not break backwards compatibility with existing parsers of
  it.

* Produce output in a form which is suitable for both offline GUIs (run
  all the way to the end, then examine output) and interactive GUIs
  (parse XML incrementally, update display as we go).

* Put as much information as possible into the XML and let the GUIs
  decide what to show the user (a.k.a. provide mechanism, not policy).

* Make XML which is actually parseable by standard XML tools.


How to use
~~~~~~~~~~

Run with flag --xml=yes. That's all. Note however several
caveats.

* At the present time only Memcheck is supported. The scheme extends
  easily enough to cover Helgrind if needed.

* When XML output is selected, various other settings are made.
  This is in order that the output format is more controlled.
  The settings which are changed are:

  - Suppression generation is disabled, as that would require user
    input.

  - Attaching to GDB is disabled for the same reason.

  - The verbosity level is set to 1 (-v).

  - Error limits are disabled. Usually if the program generates a lot
    of errors, Valgrind slows down and eventually stops collecting
    them. When outputting XML this is not the case.

  - VEX emulation warnings are not shown.

  - File descriptor leak checking is disabled. This could be
    re-enabled at some future point.

  - Maximum-detail leak checking is selected (--leak-check=full).


The output format
~~~~~~~~~~~~~~~~~
For the most part this should be self-descriptive. It is printed in a
sort-of human-readable way for easy understanding. You may want to
read the rest of this together with the results of "valgrind --xml=yes
memcheck/tests/xml1" as an example.

All tags are balanced: a <foo> tag is always closed by </foo>. Hence
in the description that follows, mention of a tag <foo> implicitly
means there is a matching closing tag </foo>.

Symbols in CAPITALS are nonterminals in the grammar and are defined
somewhere below. The root nonterminal is TOPLEVEL.

The following nonterminals are not described further:
   INT   is a 64-bit signed decimal integer.
   TEXT  is arbitrary text.
   HEX64 is a 64-bit hexadecimal number, with leading "0x".

Text strings are escaped so as to remove the <, > and & characters
which would otherwise mess up parsing. They are replaced
with the standard encodings "&lt;", "&gt;" and "&amp;" respectively.
Note this is not (yet) done throughout, only for function names in
<frame>..</frame> tag-pairs.

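The escaping rule above can be illustrated with a small Python sketch.
This is only an illustration, assuming the standard library's
xml.sax.saxutils helpers behave the same way as the encoding described
here; the helper names and the sample function name are made up for the
example and are not part of Valgrind. Note that a standard XML parser
undoes this encoding automatically when it hands back element text.

```python
from xml.sax.saxutils import escape, unescape


def escape_fn_name(name: str) -> str:
    """Encode &, < and > as &amp;, &lt; and &gt; (hypothetical helper)."""
    return escape(name)


def unescape_fn_name(text: str) -> str:
    """Reverse the encoding, as a reader of raw <fn> contents would."""
    return unescape(text)


if __name__ == "__main__":
    # Made-up C++ function name containing all three special characters.
    raw = "std::vector<int>::push_back(int const&)"
    encoded = escape_fn_name(raw)
    print(encoded)
    print(unescape_fn_name(encoded) == raw)
```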

TOPLEVEL
--------

The first line output is always this:

   <?xml version="1.0"?>

All remaining output is contained within the tag-pair
<valgrindoutput>.

Inside that, the first entity is an indication of the protocol
version. This is provided so that existing parsers can identify XML
created by future versions of Valgrind merely by observing that the
protocol version is one they don't understand. Hence TOPLEVEL is:

  <?xml version="1.0"?>
  <valgrindoutput>
    <protocolversion>INT</protocolversion>
    PROTOCOL
  </valgrindoutput>

Valgrind versions 3.0.0 and 3.0.1 emit protocol version 1. Versions
3.1.X and 3.2.X emit protocol version 2. 3.4.X emits protocol version
3.

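A parser can use the protocol version as a gate before reading anything
else. The following Python sketch shows one way to do that with the
standard xml.etree.ElementTree module. The function name, the set of
supported versions and the sample document are assumptions made for
this example; a real GUI would choose its own policy for unknown
versions.

```python
import xml.etree.ElementTree as ET

# Versions this hypothetical parser claims to understand.
SUPPORTED_PROTOCOLS = {1, 2, 3}

# Minimal hand-written document following the TOPLEVEL shape.
SAMPLE = (
    '<?xml version="1.0"?>\n'
    "<valgrindoutput>\n"
    "<protocolversion>3</protocolversion>\n"
    "</valgrindoutput>\n"
)


def read_protocol_version(xml_text: str) -> int:
    """Return the protocol version, or raise ValueError if unusable."""
    root = ET.fromstring(xml_text)
    if root.tag != "valgrindoutput":
        raise ValueError("root element is not <valgrindoutput>")
    text = root.findtext("protocolversion")
    if text is None:
        raise ValueError("missing <protocolversion>")
    version = int(text.strip())
    if version not in SUPPORTED_PROTOCOLS:
        raise ValueError(f"unsupported protocol version {version}")
    return version


if __name__ == "__main__":
    print(read_protocol_version(SAMPLE))
```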

PROTOCOL for version 3
----------------------
Changes in 3.4.X (tentative): (jrs, 1 March 2008)

* There may be more than one <logfilequalifier> clause, depending on
  how this pans out. (AshleyP perhaps to investigate)

* Some errors may have two <auxwhat> blocks, rather than just one
  (resulting from merge of the DATASYMS branch)

* Some errors may have an ORIGIN component, indicating the origins of
  uninitialised values. This results from the merge of the
  OTRACK_BY_INSTRUMENTATION branch.


PROTOCOL for version 2
----------------------
Version 2 is identical in every way to version 1, except that the time
string in

   <time>human-readable-time-string</time>

has changed format, and is also elapsed wallclock time since process
start, and not local time or any such. In fact version 1 does not
define the format of the string so in some ways this revision is
irrelevant.


PROTOCOL for version 1
----------------------
This is the main top-level construction. Roughly speaking, it
contains a load of preamble, the errors from the run of the
program, and the result of the final leak check. Hence the
following in sequence:

* Various preamble lines which give version info for the various
  components. The text in them can be anything; it is not intended
  for interpretation by the GUI:

     <preamble>
        <line>Misc version/copyright text</line>  (zero or more of)
     </preamble>

* The PID of this process and of its parent:

     <pid>INT</pid>
     <ppid>INT</ppid>

* The name of the tool being used:

     <tool>TEXT</tool>

* OPTIONALLY, if --log-file-qualifier=VAR flag was given:

     <logfilequalifier> <var>VAR</var> <value>$VAR</value>
     </logfilequalifier>

  That is, both the name of the environment variable and its value
  are given.
  [update: as of v3.3.0, this is not present, as the --log-file-qualifier
  option has been removed, replaced by the %q format specifier in --log-file.]

* OPTIONALLY, if --xml-user-comment=STRING was given:

     <usercomment>STRING</usercomment>

  STRING is not escaped in any way, so that it itself may be a piece
  of XML with arbitrary tags etc.

* The program and args: first those pertaining to Valgrind itself, and
  then those pertaining to the program to be run under Valgrind (the
  client):

     <args>
       <vargv>
         <exe>TEXT</exe>
         <arg>TEXT</arg> (zero or more of)
       </vargv>
       <argv>
         <exe>TEXT</exe>
         <arg>TEXT</arg> (zero or more of)
       </argv>
     </args>

* The following, indicating that the program has now started:

     <status> <state>RUNNING</state>
              <time>human-readable-time-string</time>
     </status>

* Zero or more of (either ERROR or ERRORCOUNTS).

* The following, indicating that the program has now finished, and
  that the wrapup (leak checking) is happening.

     <status> <state>FINISHED</state>
              <time>human-readable-time-string</time>
     </status>

* SUPPCOUNTS, indicating how many times each suppression was used.

* Zero or more ERRORs, each of which is a complaint from the
  leak checker.

That's it.

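As a rough illustration of the sequence just described, the following
Python sketch pulls the header fields out of a complete document. It is
a minimal sketch for an offline reader, not a full parser: the sample
document, its PID values, its time strings and the function name are
all invented for the example.

```python
import xml.etree.ElementTree as ET

# Hand-written sample; every concrete value here is a placeholder.
SAMPLE = """<valgrindoutput>
<protocolversion>1</protocolversion>
<preamble><line>Misc version/copyright text</line></preamble>
<pid>1234</pid>
<ppid>1200</ppid>
<tool>memcheck</tool>
<args>
  <vargv><exe>valgrind</exe><arg>--xml=yes</arg></vargv>
  <argv><exe>./a.out</exe></argv>
</args>
<status> <state>RUNNING</state> <time>placeholder-start</time> </status>
<status> <state>FINISHED</state> <time>placeholder-end</time> </status>
<suppcounts></suppcounts>
</valgrindoutput>"""


def summarise_run(xml_text: str) -> dict:
    """Collect the fixed header fields of a finished run."""
    root = ET.fromstring(xml_text)
    return {
        "pid": int(root.findtext("pid")),
        "ppid": int(root.findtext("ppid")),
        "tool": root.findtext("tool"),
        # Arguments given to Valgrind itself, then the client executable.
        "valgrind_args": [a.text for a in root.findall("args/vargv/arg")],
        "client_exe": root.findtext("args/argv/exe"),
        # RUNNING is expected first, FINISHED second.
        "states": [s.findtext("state") for s in root.findall("status")],
    }


if __name__ == "__main__":
    print(summarise_run(SAMPLE))
```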

ERROR
-----
This shows an error, and is the most complex nonterminal. The format
is as follows:

  <error>
     <unique>HEX64</unique>
     <tid>INT</tid>
     <kind>KIND</kind>
     <what>TEXT</what>

     optionally: <leakedbytes>INT</leakedbytes>
     optionally: <leakedblocks>INT</leakedblocks>

     STACK

     optionally: <auxwhat>TEXT</auxwhat>
     optionally: STACK
     optionally: ORIGIN

  </error>

* Each error contains a unique, arbitrary 64-bit hex number. This is
  used to refer to the error in ERRORCOUNTS nonterminals (see below).

* The <tid> tag indicates the Valgrind thread number. This value
  is arbitrary but may be used to determine which threads produced
  which errors (at least, the first instance of each error).

* The <kind> tag specifies one of a small number of fixed error
  types (enumerated below), so that GUIs may roughly categorise
  errors by type if they want.

* The <what> tag gives a human-understandable description of the
  error.

* For <kind> tags specifying a KIND of the form "Leak_*", the
  optional <leakedbytes> and <leakedblocks> indicate the number of
  bytes and blocks leaked by this error.

* The primary STACK for this error, indicating where it occurred.

* Some error types may have auxiliary information attached:

     <auxwhat>TEXT</auxwhat> gives an auxiliary human-readable
     description (usually of invalid addresses)

     STACK gives an auxiliary stack (usually the allocation/free
     point of a block). If this STACK is present then
     <auxwhat>TEXT</auxwhat> will precede it.

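To make the optional parts of ERROR concrete, here is a Python sketch
that maps one <error> element onto a record. The ErrorRecord class, its
field names and the sample error (including its <what> text and
address) are assumptions made for this example. Only direct children of
<error> are inspected, so a <what> or <stack> nested inside an ORIGIN
is not confused with the error's own.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List, Optional

# Hand-written sample error; all values are placeholders.
SAMPLE_ERROR = """<error>
  <unique>0x2</unique>
  <tid>1</tid>
  <kind>Leak_DefinitelyLost</kind>
  <what>10 bytes in 1 blocks are definitely lost</what>
  <leakedbytes>10</leakedbytes>
  <leakedblocks>1</leakedblocks>
  <stack><frame><ip>0x4005D4</ip><fn>main</fn></frame></stack>
</error>"""


@dataclass
class ErrorRecord:
    unique: int
    tid: int
    kind: str
    what: str
    leaked_bytes: Optional[int]   # only expected for Leak_* kinds
    leaked_blocks: Optional[int]  # only expected for Leak_* kinds
    auxwhat: List[str]            # zero, one or (protocol 3) two entries
    stack_count: int              # primary STACK plus optional auxiliary
    has_origin: bool


def parse_error(elem: ET.Element) -> ErrorRecord:
    def optional_int(tag: str) -> Optional[int]:
        text = elem.findtext(tag)
        return int(text) if text is not None else None

    return ErrorRecord(
        unique=int(elem.findtext("unique"), 16),  # HEX64 with leading 0x
        tid=int(elem.findtext("tid")),
        kind=elem.findtext("kind"),
        what=elem.findtext("what"),
        leaked_bytes=optional_int("leakedbytes"),
        leaked_blocks=optional_int("leakedblocks"),
        auxwhat=[a.text or "" for a in elem.findall("auxwhat")],
        stack_count=len(elem.findall("stack")),
        has_origin=elem.find("origin") is not None,
    )


if __name__ == "__main__":
    print(parse_error(ET.fromstring(SAMPLE_ERROR)))
```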

KIND
----
This is a small enumeration indicating roughly the nature of an error.
The possible values are:

   InvalidFree

      free/delete/delete[] on an invalid pointer

   MismatchedFree

      free/delete/delete[] does not match allocation function
      (e.g. doing new[] then free on the result)

   InvalidRead

      read of an invalid address

   InvalidWrite

      write of an invalid address

   InvalidJump

      jump to an invalid address

   Overlap

      args overlap or are otherwise bogus in e.g. memcpy

   InvalidMemPool

      invalid mem pool specified in client request

   UninitCondition

      conditional jump/move depends on undefined value

   UninitValue

      other use of undefined value (primarily memory addresses)

   SyscallParam

      system call params are undefined or point to
      undefined/unaddressable memory

   ClientCheck

      "error" resulting from a client check request

   Leak_DefinitelyLost

      memory leak; the referenced blocks are definitely lost

   Leak_IndirectlyLost

      memory leak; the referenced blocks are lost because all pointers
      to them are also in leaked blocks

   Leak_PossiblyLost

      memory leak; only interior pointers to referenced blocks were
      found

   Leak_StillReachable

      memory leak; pointers to un-freed blocks are still available

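A GUI that wants to categorise errors by KIND might start from
something like the Python sketch below. The set of names is copied from
the list above; the bucket labels ("leak", "error", "unknown") and the
function names are purely illustrative choices, not anything Valgrind
defines. Treating unrecognised kinds as "unknown" rather than failing
keeps the reader tolerant of kinds added later.

```python
# KIND values as enumerated in this document.
KNOWN_KINDS = frozenset({
    "InvalidFree", "MismatchedFree", "InvalidRead", "InvalidWrite",
    "InvalidJump", "Overlap", "InvalidMemPool", "UninitCondition",
    "UninitValue", "SyscallParam", "ClientCheck",
    "Leak_DefinitelyLost", "Leak_IndirectlyLost",
    "Leak_PossiblyLost", "Leak_StillReachable",
})


def is_leak_kind(kind: str) -> bool:
    """Leak kinds are the ones of the form Leak_*."""
    return kind.startswith("Leak_")


def bucket_for(kind: str) -> str:
    """Map a KIND to a coarse display bucket (hypothetical labels)."""
    if kind not in KNOWN_KINDS:
        return "unknown"
    return "leak" if is_leak_kind(kind) else "error"


if __name__ == "__main__":
    for kind in ("InvalidRead", "Leak_PossiblyLost", "SomethingNew"):
        print(kind, bucket_for(kind))
```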

STACK
-----
STACK indicates locations in the program being debugged. A STACK
is one or more FRAMEs. The first is the innermost frame, the
next its caller, etc.

   <stack>
      one or more FRAME
   </stack>


FRAME
-----
FRAME records a single program location:

   <frame>
      <ip>HEX64</ip>
      optionally <obj>TEXT</obj>
      optionally <fn>TEXT</fn>
      optionally <dir>TEXT</dir>
      optionally <file>TEXT</file>
      optionally <line>INT</line>
   </frame>

Only the <ip> field is guaranteed to be present. It indicates a
code ("instruction pointer") address.

The optional fields, if present, appear in the order stated:

* obj: gives the name of the ELF object containing the code address

* fn: gives the name of the function containing the code address

* dir: gives the source directory associated with the name specified
  by <file>. Note the current implementation often does not
  put anything useful in this field.

* file: gives the name of the source file containing the code address

* line: gives the line number in the source file

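Since only <ip> is guaranteed, code that displays a FRAME has to cope
with every other field being absent. The Python sketch below shows one
possible way to render a STACK, innermost frame first. The output
layout, the function name and the sample addresses and paths are
assumptions made for this example. The XML parser decodes "&lt;" and
"&gt;" in <fn> automatically.

```python
import xml.etree.ElementTree as ET

# Hand-written sample stack; addresses and paths are placeholders.
SAMPLE_STACK = """<stack>
  <frame><ip>0x4005D4</ip><obj>/tmp/a.out</obj><fn>foo&lt;int&gt;</fn>
         <file>a.cpp</file><line>12</line></frame>
  <frame><ip>0x4A2B3C</ip><obj>/lib/libc.so.6</obj></frame>
  <frame><ip>0x4005F0</ip></frame>
</stack>"""


def describe_frame(frame: ET.Element) -> str:
    """Render one FRAME, using whichever optional fields are present."""
    parts = [frame.findtext("ip")]  # always present
    fn = frame.findtext("fn")
    obj = frame.findtext("obj")
    file = frame.findtext("file")
    line = frame.findtext("line")
    if fn:
        parts.append(f"in {fn}")
    if file:
        # Prefer source location; fall back to the object name below.
        parts.append(f"({file}:{line})" if line is not None else f"({file})")
    elif obj:
        parts.append(f"(in {obj})")
    return " ".join(parts)


if __name__ == "__main__":
    stack = ET.fromstring(SAMPLE_STACK)
    for frame in stack.findall("frame"):
        print(describe_frame(frame))
```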

ORIGIN
------
ORIGIN shows the origin of uninitialised data in errors that involve
uninitialised data. STACK shows the origin of the uninitialised
value. TEXT gives a human-understandable hint as to the meaning of
the information in STACK.

   <origin>
      <what>TEXT</what>
      STACK
   </origin>


ERRORCOUNTS
-----------
This specifies, for each error that has been so far presented,
the number of occurrences of that error.

   <errorcounts>
      zero or more of
         <pair> <count>INT</count> <unique>HEX64</unique> </pair>
   </errorcounts>

Each <pair> gives the current error count <count> for the error with
unique tag <unique>. The counts do not have to give a count for each
error so far presented - partial information is allowable.

As at Valgrind rev 3793, error counts are only emitted at program
termination. However, it is perfectly acceptable to periodically emit
error counts as the program is running. Doing so would allow a
GUI to dynamically update its error-count display as the program runs.

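For an interactive GUI, ERRORCOUNTS blocks can be picked up while the
document is still incomplete. The Python sketch below uses
xml.etree.ElementTree.XMLPullParser for this. It is a sketch under the
assumption that output arrives in arbitrary text chunks; the function
name and the sample chunks are invented. Later counts for the same
<unique> simply overwrite earlier ones, which matches the "partial
information is allowable" rule.

```python
import xml.etree.ElementTree as ET
from typing import Dict


def feed_chunk(parser: ET.XMLPullParser, chunk: str,
               counts: Dict[int, int]) -> None:
    """Feed one chunk of output and fold any finished ERRORCOUNTS in."""
    parser.feed(chunk)
    # flush() only exists on recent Python versions; use it when present
    # so that already-complete elements are reported promptly.
    flush = getattr(parser, "flush", None)
    if flush is not None:
        flush()
    for _event, elem in parser.read_events():
        if elem.tag == "errorcounts":
            for pair in elem.findall("pair"):
                unique = int(pair.findtext("unique"), 16)
                counts[unique] = int(pair.findtext("count"))


if __name__ == "__main__":
    chunks = [
        "<valgrindoutput><errorcounts>",
        "<pair> <count>3</count> <unique>0x1</unique> </pair>",
        "</errorcounts>",
        "<errorcounts><pair> <count>5</count> <unique>0x1</unique> </pair>"
        "</errorcounts>",
    ]
    parser = ET.XMLPullParser(events=("end",))
    counts: Dict[int, int] = {}
    for chunk in chunks:
        feed_chunk(parser, chunk, counts)
        print(counts)
```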

SUPPCOUNTS
----------
A SUPPCOUNTS block appears exactly once, after the program terminates.
It specifies the number of times each error-suppression was used.
Suppressions not mentioned were used zero times.

   <suppcounts>
      zero or more of
         <pair> <count>INT</count> <name>TEXT</name> </pair>
   </suppcounts>

The <name> is as specified in the suppression name fields in .supp
files.

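The "not mentioned means zero" rule is easy to get wrong in a reader,
so here is a short Python sketch of a lookup that honours it. The
function name and the suppression name in the sample are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hand-written sample; "my-libc-supp" is a made-up suppression name.
SAMPLE_SUPPCOUNTS = """<suppcounts>
  <pair> <count>4</count> <name>my-libc-supp</name> </pair>
</suppcounts>"""


def suppression_uses(suppcounts: ET.Element, name: str) -> int:
    """Return how often a suppression was used; absent means zero."""
    for pair in suppcounts.findall("pair"):
        if pair.findtext("name") == name:
            return int(pair.findtext("count"))
    return 0


if __name__ == "__main__":
    block = ET.fromstring(SAMPLE_SUPPCOUNTS)
    print(suppression_uses(block, "my-libc-supp"))
    print(suppression_uses(block, "never-used"))
```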