sewardj | 9829e38 | 2005-05-24 14:17:41 +0000 | [diff] [blame] | 1 | |
| 2 | As of May 2005, Valgrind can produce its output in XML form. The |
| 3 | intention is to provide an easily parsed, stable format which is |
| 4 | suitable for GUIs to read. |
| 5 | |
| 6 | |
| 7 | Design goals |
| 8 | ~~~~~~~~~~~~ |
| 9 | |
| 10 | * Produce XML output which is easily parsed |
| 11 | |
| 12 | * Have a stable output format which does not change much over time, so |
| 13 | that investments in parser-writing by GUI developers are not lost as |
| 14 | new versions of Valgrind appear. |
| 15 | |
| 16 | * Have an extensible output format, so that future changes to the |
| 17 | format do not break backwards compatibility with existing parsers of |
| 18 | it. |
| 19 | |
| 20 | * Produce output in a form which is suitable for both offline GUIs (run |
| 21 | all the way to the end, then examine output) and interactive GUIs |
| 22 | (parse XML incrementally, update display as we go). |
| 23 | |
| 24 | * Put as much information as possible into the XML and let the GUIs |
| 25 | decide what to show the user (a.k.a. provide mechanism, not policy). |
| 26 | |
sewardj | 57d99c5 | 2005-06-13 16:44:33 +0000 | [diff] [blame] | 27 | * Make XML which is actually parseable by standard XML tools. |
| 28 | |
sewardj | 9829e38 | 2005-05-24 14:17:41 +0000 | [diff] [blame] | 29 | |
| 30 | How to use |
| 31 | ~~~~~~~~~~ |
| 32 | |
de | e6ca7bd | 2005-08-03 18:58:45 +0000 | [diff] [blame^] | 33 | Run with the flag --xml=yes. That's all. Note, however, several |
sewardj | 9829e38 | 2005-05-24 14:17:41 +0000 | [diff] [blame] | 34 | caveats. |
| 35 | |
| 36 | * At the present time only Memcheck is supported. The scheme extends |
| 37 | easily enough to cover Addrcheck and Helgrind if needed. |
| 38 | |
| 39 | * When XML output is selected, various other settings are made. |
| 40 | This is in order that the output format is more controlled. |
| 41 | The settings which are changed are: |
| 42 | |
| 43 | - Suppression generation is disabled, as that would require user |
| 44 | input. |
| 45 | |
| 46 | - Attaching to GDB is disabled for the same reason. |
| 47 | |
| 48 | - The verbosity level is set to 1 (-v). |
| 49 | |
| 50 | - Error limits are disabled. Usually if the program generates a lot |
| 51 | of errors, Valgrind slows down and eventually stops collecting |
| 52 | them. When outputting XML this is not the case. |
| 53 | |
| 54 | - VEX emulation warnings are not shown. |
| 55 | |
| 56 | - File descriptor leak checking is disabled. This could be |
| 57 | re-enabled at some future point. |
| 58 | |
| 59 | - Maximum-detail leak checking is selected (--leak-check=full). |
| 60 | |
| 61 | |
| 62 | The output format |
| 63 | ~~~~~~~~~~~~~~~~~ |
sewardj | 9e7212f | 2005-05-24 15:00:55 +0000 | [diff] [blame] | 64 | For the most part this should be self-descriptive. It is printed in a |
| 65 | sort-of human-readable way for easy understanding. You may want to |
| 66 | read the rest of this together with the results of "valgrind --xml=yes |
| 67 | memcheck/tests/xml1" as an example. |
sewardj | 9829e38 | 2005-05-24 14:17:41 +0000 | [diff] [blame] | 68 | |
| 69 | All tags are balanced: a <foo> tag is always closed by </foo>. Hence |
| 70 | in the description that follows, mention of a tag <foo> implicitly |
| 71 | means there is a matching closing tag </foo>. |
| 72 | |
| 73 | Symbols in CAPITALS are nonterminals in the grammar and are defined |
| 74 | somewhere below. The root nonterminal is TOPLEVEL. |
| 75 | |
| 76 | The following nonterminals are not described further: |
| 77 | INT is a 64-bit signed decimal integer. |
| 78 | TEXT is arbitrary text. |
sewardj | 9e7212f | 2005-05-24 15:00:55 +0000 | [diff] [blame] | 79 | HEX64 is a 64-bit hexadecimal number, with leading "0x". |
sewardj | 9829e38 | 2005-05-24 14:17:41 +0000 | [diff] [blame] | 80 | |
sewardj | 57d99c5 | 2005-06-13 16:44:33 +0000 | [diff] [blame] | 81 | Text strings are escaped so as to remove the <, > and & characters |
| 82 | which would otherwise mess up parsing. They are replaced |
| 83 | with the standard encodings "&lt;", "&gt;" and "&amp;" respectively. |
| 84 | Note this is not (yet) done throughout, only for function names in |
| 85 | <frame>..</frame> tag-pairs. |
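A consumer that uses a standard XML library does not normally need to undo this escaping by hand, since the parser decodes the three entities itself. The Python sketch below illustrates the round trip. The function name used as input is made up for illustration and is not taken from real Valgrind output.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, unescape

# Hypothetical C++ function name containing all three special characters.
raw_name = "std::vector<int>::at(unsigned long) const &"

# escape() replaces & first, then < and >, giving the standard entities.
escaped = escape(raw_name)

# Explicit decoding, for consumers that process the raw text themselves.
decoded_by_hand = unescape(escaped)

# A standard XML parser performs the same decoding automatically.
decoded_by_parser = ET.fromstring("<fn>" + escaped + "</fn>").text
```

Because escaping is only applied to function names, text taken from other tags may still contain raw special characters and should be treated with care.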
| 86 | |
sewardj | 9829e38 | 2005-05-24 14:17:41 +0000 | [diff] [blame] | 87 | |
| 88 | TOPLEVEL |
| 89 | -------- |
sewardj | 57d99c5 | 2005-06-13 16:44:33 +0000 | [diff] [blame] | 90 | |
| 91 | The first line output is always this: |
| 92 | |
| 93 | <?xml version="1.0"?> |
| 94 | |
| 95 | All remaining output is contained within the tag-pair |
| 96 | <valgrindoutput>. |
sewardj | 9829e38 | 2005-05-24 14:17:41 +0000 | [diff] [blame] | 97 | |
| 98 | Inside that, the first entity is an indication of the protocol |
| 99 | version. This is provided so that existing parsers can identify XML |
| 100 | created by future versions of Valgrind merely by observing that the |
de | e6ca7bd | 2005-08-03 18:58:45 +0000 | [diff] [blame^] | 101 | protocol version is one they don't understand. Hence TOPLEVEL is: |
sewardj | 9829e38 | 2005-05-24 14:17:41 +0000 | [diff] [blame] | 102 | |
sewardj | 8665d8e | 2005-06-01 17:35:23 +0000 | [diff] [blame] | 103 | <?xml version="1.0"?> |
sewardj | 9829e38 | 2005-05-24 14:17:41 +0000 | [diff] [blame] | 104 | <valgrindoutput> |
| 105 | <protocolversion>INT</protocolversion> |
| 106 | VERSION1STUFF |
| 107 | </valgrindoutput> |
| 108 | |
| 109 | The only currently defined protocol version number is 1. This |
| 110 | document only defines protocol version 1. |
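As an illustration of how a parser might use the protocol version, the following Python sketch rejects any document whose version it does not know. The function name check_protocol_version, the set SUPPORTED_PROTOCOL_VERSIONS and the sample document are assumptions made for this example. The sketch reads a complete document, so it suits offline use only.

```python
import xml.etree.ElementTree as ET

# Versions this hypothetical parser understands; this document defines only 1.
SUPPORTED_PROTOCOL_VERSIONS = {1}

# Minimal made-up document containing just the version element.
SAMPLE_V1 = (
    '<?xml version="1.0"?>\n'
    "<valgrindoutput>\n"
    "<protocolversion>1</protocolversion>\n"
    "</valgrindoutput>\n"
)


def check_protocol_version(xml_text):
    """Return the protocol version, or raise ValueError if it is unusable."""
    root = ET.fromstring(xml_text)
    if root.tag != "valgrindoutput":
        raise ValueError("unexpected root element: " + root.tag)
    version_text = root.findtext("protocolversion")
    if version_text is None:
        raise ValueError("missing <protocolversion>")
    version = int(version_text)
    if version not in SUPPORTED_PROTOCOL_VERSIONS:
        raise ValueError("unsupported protocol version: %d" % version)
    return version
```

Refusing unknown versions outright is one possible policy. A GUI could instead warn the user and attempt a best-effort parse.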
| 111 | |
| 112 | |
| 113 | VERSION1STUFF |
| 114 | ------------- |
| 115 | This is the main top-level construction. Roughly speaking, it |
| 116 | contains a load of preamble, the errors from the run of the |
| 117 | program, and the result of the final leak check. Hence the |
| 118 | following in sequence: |
| 119 | |
| 120 | * Various preamble lines which give version info for the various |
| 121 | components. The text in them can be anything; it is not intended |
| 122 | for interpretation by the GUI: |
| 123 | |
sewardj | 57d99c5 | 2005-06-13 16:44:33 +0000 | [diff] [blame] | 124 | <preamble> |
| 125 | <line>Misc version/copyright text</line> (zero or more of) |
| 126 | </preamble> |
sewardj | 9829e38 | 2005-05-24 14:17:41 +0000 | [diff] [blame] | 127 | |
| 128 | * The PID of this process and of its parent: |
| 129 | |
| 130 | <pid>INT</pid> |
| 131 | <ppid>INT</ppid> |
| 132 | |
| 133 | * The name of the tool being used: |
| 134 | |
| 135 | <tool>TEXT</tool> |
| 136 | |
sewardj | ad31116 | 2005-07-19 11:25:02 +0000 | [diff] [blame] | 137 | * OPTIONALLY, if --log-file-qualifier=VAR flag was given: |
| 138 | |
| 139 | <logfilequalifier> <var>VAR</var> <value>$VAR</value> |
| 140 | </logfilequalifier> |
| 141 | |
| 142 | That is, both the name of the environment variable and its value |
| 143 | are given. |
| 144 | |
sewardj | e5e1f82 | 2005-07-19 14:59:41 +0000 | [diff] [blame] | 145 | * OPTIONALLY, if --xml-user-comment=STRING was given: |
| 146 | |
| 147 | <usercomment>STRING</usercomment> |
| 148 | |
| 149 | STRING is not escaped in any way, so that it itself may be a piece |
| 150 | of XML with arbitrary tags etc. |
| 151 | |
sewardj | b8a3dac | 2005-07-19 12:39:11 +0000 | [diff] [blame] | 152 | * The program and args: first those pertaining to Valgrind itself, and |
| 153 | then those pertaining to the program to be run under Valgrind (the |
| 154 | client): |
sewardj | 9829e38 | 2005-05-24 14:17:41 +0000 | [diff] [blame] | 155 | |
sewardj | b8a3dac | 2005-07-19 12:39:11 +0000 | [diff] [blame] | 156 | <args> |
| 157 | <vargv> |
| 158 | <exe>TEXT</exe> |
| 159 | <arg>TEXT</arg> (zero or more of) |
| 160 | </vargv> |
| 161 | <argv> |
| 162 | <exe>TEXT</exe> |
| 163 | <arg>TEXT</arg> (zero or more of) |
| 164 | </argv> |
| 165 | </args> |
sewardj | 9829e38 | 2005-05-24 14:17:41 +0000 | [diff] [blame] | 166 | |
| 167 | * The following, indicating that the program has now started: |
| 168 | |
sewardj | 33e6042 | 2005-07-24 07:33:15 +0000 | [diff] [blame] | 169 | <status> <state>RUNNING</state> |
| 170 | <time>human-readable-time-string</time> |
sewardj | 68cde6f | 2005-07-19 12:17:51 +0000 | [diff] [blame] | 171 | </status> |
sewardj | 9829e38 | 2005-05-24 14:17:41 +0000 | [diff] [blame] | 172 | |
| 173 | * Zero or more of (either ERROR or ERRORCOUNTS). |
| 174 | |
| 175 | * The following, indicating that the program has now finished, and |
| 176 | that the wrapup (leak checking) is happening. |
| 177 | |
sewardj | 33e6042 | 2005-07-24 07:33:15 +0000 | [diff] [blame] | 178 | <status> <state>FINISHED</state> |
| 179 | <time>human-readable-time-string</time> |
sewardj | 68cde6f | 2005-07-19 12:17:51 +0000 | [diff] [blame] | 180 | </status> |
sewardj | 9829e38 | 2005-05-24 14:17:41 +0000 | [diff] [blame] | 181 | |
| 182 | * SUPPCOUNTS, indicating how many times each suppression was used. |
| 183 | |
| 184 | * Zero or more ERRORs, each of which is a complaint from the |
| 185 | leak checker. |
| 186 | |
de | e6ca7bd | 2005-08-03 18:58:45 +0000 | [diff] [blame^] | 187 | That's it. |
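Since the sequence above is emitted as the program runs, an interactive GUI can consume it piece by piece. The Python sketch below shows one way this might be done with the standard library's XMLPullParser. The sample document, the 16-character chunk size and the function name collect_events are assumptions made for illustration, and the values in the sample are invented.

```python
import xml.etree.ElementTree as ET

# Made-up output following the sequence described above.
SAMPLE = (
    '<?xml version="1.0"?>\n'
    "<valgrindoutput>\n"
    "<protocolversion>1</protocolversion>\n"
    "<preamble><line>example preamble text</line></preamble>\n"
    "<pid>1234</pid>\n<ppid>1200</ppid>\n<tool>memcheck</tool>\n"
    "<status> <state>RUNNING</state> <time>t0</time> </status>\n"
    "<error><unique>0x1</unique><tid>1</tid><kind>InvalidRead</kind>"
    "<what>Invalid read of size 4</what>"
    "<stack><frame><ip>0x8048000</ip></frame></stack></error>\n"
    "<status> <state>FINISHED</state> <time>t1</time> </status>\n"
    "<suppcounts></suppcounts>\n"
    "</valgrindoutput>\n"
)


def collect_events(chunks):
    """Feed text chunks to a pull parser and report elements as they close."""
    parser = ET.XMLPullParser(events=("end",))
    seen = []
    for chunk in chunks:
        parser.feed(chunk)
        # Only elements whose closing tag has already arrived are reported.
        for _event, elem in parser.read_events():
            if elem.tag in ("pid", "ppid", "tool"):
                seen.append((elem.tag, elem.text))
            elif elem.tag == "status":
                seen.append(("status", elem.findtext("state")))
            elif elem.tag == "error":
                seen.append(("error", elem.findtext("kind")))
    return seen


# Simulate data arriving in small pieces, as it might from a pipe or socket.
chunks = [SAMPLE[i:i + 16] for i in range(0, len(SAMPLE), 16)]
events = collect_events(chunks)
```

A real GUI would update its display inside the loop rather than collecting a list.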
sewardj | 9829e38 | 2005-05-24 14:17:41 +0000 | [diff] [blame] | 188 | |
| 189 | |
| 190 | ERROR |
| 191 | ----- |
| 192 | This shows an error, and is the most complex nonterminal. The format |
| 193 | is as follows: |
| 194 | |
| 195 | <error> |
| 196 | <unique>HEX64</unique> |
| 197 | <tid>INT</tid> |
| 198 | <kind>KIND</kind> |
| 199 | <what>TEXT</what> |
| 200 | |
| 201 | optionally: <leakedbytes>INT</leakedbytes> |
| 202 | optionally: <leakedblocks>INT</leakedblocks> |
| 203 | |
| 204 | STACK |
| 205 | |
| 206 | optionally: <auxwhat>TEXT</auxwhat> |
| 207 | optionally: STACK |
| 208 | |
| 209 | </error> |
| 210 | |
| 211 | * Each error contains a unique, arbitrary 64-bit hex number. This is |
| 212 | used to refer to the error in ERRORCOUNTS nonterminals (see below). |
| 213 | |
| 214 | * The <tid> tag indicates the Valgrind thread number. This value |
| 215 | is arbitrary but may be used to determine which threads produced |
| 216 | which errors (at least, the first instance of each error). |
| 217 | |
| 218 | * The <kind> tag specifies one of a small number of fixed error |
| 219 | types (enumerated below), so that GUIs may roughly categorise |
| 220 | errors by type if they want. |
| 221 | |
| 222 | * The <what> tag gives a human-understandable description of the |
| 223 | error. |
| 224 | |
| 225 | * For <kind> tags specifying a KIND of the form "Leak_*", the |
| 226 | optional <leakedbytes> and <leakedblocks> indicate the number of |
| 227 | bytes and blocks leaked by this error. |
| 228 | |
| 229 | * The primary STACK for this error, indicating where it occurred. |
| 230 | |
| 231 | * Some error types may have auxiliary information attached: |
| 232 | |
| 233 | <auxwhat>TEXT</auxwhat> gives an auxiliary human-readable |
| 234 | description (usually of invalid addresses) |
| 235 | |
| 236 | STACK gives an auxiliary stack (usually the allocation/free |
| 237 | point of a block). If this STACK is present then |
| 238 | <auxwhat>TEXT</auxwhat> will precede it. |
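The structure of ERROR can be mirrored fairly directly in a record type. The Python sketch below is one possible mapping. The class ErrorRecord, its field names and the sample error are assumptions made for this example, and stacks are reduced to lists of instruction pointers to keep it short.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ErrorRecord:
    unique: int
    tid: int
    kind: str
    what: str
    leaked_bytes: Optional[int]
    leaked_blocks: Optional[int]
    stack_ips: List[int]
    auxwhat: Optional[str]
    aux_stack_ips: Optional[List[int]]


def _optional_int(elem, tag):
    text = elem.findtext(tag)
    return int(text) if text is not None else None


def _stack_ips(stack):
    # HEX64 values carry a leading "0x", which int(..., 16) accepts.
    return [int(frame.findtext("ip"), 16) for frame in stack.findall("frame")]


def parse_error(elem):
    """Convert an <error> element into an ErrorRecord."""
    stacks = elem.findall("stack")  # primary stack first, then the optional one
    return ErrorRecord(
        unique=int(elem.findtext("unique"), 16),
        tid=int(elem.findtext("tid")),
        kind=elem.findtext("kind"),
        what=elem.findtext("what"),
        leaked_bytes=_optional_int(elem, "leakedbytes"),
        leaked_blocks=_optional_int(elem, "leakedblocks"),
        stack_ips=_stack_ips(stacks[0]),
        auxwhat=elem.findtext("auxwhat"),
        aux_stack_ips=_stack_ips(stacks[1]) if len(stacks) > 1 else None,
    )


# Invented example of an error with auxiliary information attached.
SAMPLE_ERROR = """<error>
  <unique>0x2</unique>
  <tid>1</tid>
  <kind>InvalidRead</kind>
  <what>Invalid read of size 4</what>
  <stack>
    <frame><ip>0x8048401</ip><fn>main</fn></frame>
  </stack>
  <auxwhat>Address 0x1BA45050 is 0 bytes after a block of size 40</auxwhat>
  <stack>
    <frame><ip>0x1B9008A1</ip><fn>malloc</fn></frame>
    <frame><ip>0x80483F5</ip><fn>main</fn></frame>
  </stack>
</error>"""

record = parse_error(ET.fromstring(SAMPLE_ERROR))
```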
| 239 | |
| 240 | |
| 241 | KIND |
| 242 | ---- |
| 243 | This is a small enumeration indicating roughly the nature of an error. |
| 244 | The possible values are: |
| 245 | |
| 246 | InvalidFree |
| 247 | |
| 248 | free/delete/delete[] on an invalid pointer |
| 249 | |
| 250 | MismatchedFree |
| 251 | |
| 252 | free/delete/delete[] does not match allocation function |
| 253 | (eg doing new[] then free on the result) |
| 254 | |
| 255 | InvalidRead |
| 256 | |
| 257 | read of an invalid address |
| 258 | |
| 259 | InvalidWrite |
| 260 | |
| 261 | write of an invalid address |
| 262 | |
| 263 | InvalidJump |
| 264 | |
| 265 | jump to an invalid address |
| 266 | |
| 267 | Overlap |
| 268 | |
| 269 | args overlap or are otherwise bogus in eg memcpy |
| 270 | |
| 271 | InvalidMemPool |
| 272 | |
| 273 | invalid mem pool specified in client request |
| 274 | |
| 275 | UninitCondition |
| 276 | |
| 277 | conditional jump/move depends on undefined value |
| 278 | |
| 279 | UninitValue |
| 280 | |
| 281 | other use of undefined value (primarily memory addresses) |
| 282 | |
| 283 | SyscallParam |
| 284 | |
| 285 | system call params are undefined or point to |
| 286 | undefined/unaddressable memory |
| 287 | |
| 288 | ClientCheck |
| 289 | |
| 290 | "error" resulting from a client check request |
| 291 | |
| 292 | Leak_DefinitelyLost |
| 293 | |
| 294 | memory leak; the referenced blocks are definitely lost |
| 295 | |
| 296 | Leak_IndirectlyLost |
| 297 | |
| 298 | memory leak; the referenced blocks are lost because all pointers |
| 299 | to them are also in leaked blocks |
| 300 | |
| 301 | Leak_PossiblyLost |
| 302 | |
| 303 | memory leak; only interior pointers to referenced blocks were |
| 304 | found |
| 305 | |
| 306 | Leak_StillReachable |
| 307 | |
| 308 | memory leak; pointers to un-freed blocks are still available |
| 309 | |
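A GUI that wants to categorise errors could hold the values above in an enumeration. The Python sketch below is one way to do so. The names Kind, is_leak and classify are assumptions made for this example. Returning None for an unrecognised value is a design choice of the sketch, not something the format requires.

```python
from enum import Enum


class Kind(Enum):
    InvalidFree = "InvalidFree"
    MismatchedFree = "MismatchedFree"
    InvalidRead = "InvalidRead"
    InvalidWrite = "InvalidWrite"
    InvalidJump = "InvalidJump"
    Overlap = "Overlap"
    InvalidMemPool = "InvalidMemPool"
    UninitCondition = "UninitCondition"
    UninitValue = "UninitValue"
    SyscallParam = "SyscallParam"
    ClientCheck = "ClientCheck"
    Leak_DefinitelyLost = "Leak_DefinitelyLost"
    Leak_IndirectlyLost = "Leak_IndirectlyLost"
    Leak_PossiblyLost = "Leak_PossiblyLost"
    Leak_StillReachable = "Leak_StillReachable"


def is_leak(kind):
    """True for the Leak_* kinds, which may carry leakedbytes/leakedblocks."""
    return kind.value.startswith("Leak_")


def classify(kind_text):
    """Map the text of a <kind> tag to a Kind, or None if it is not listed."""
    try:
        return Kind(kind_text.strip())
    except ValueError:
        return None
```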
| 310 | |
| 311 | STACK |
| 312 | ----- |
| 313 | STACK indicates locations in the program being debugged. A STACK |
| 314 | is one or more FRAMEs. The first is the innermost frame, the |
| 315 | next its caller, etc. |
| 316 | |
| 317 | <stack> |
| 318 | one or more FRAME |
| 319 | </stack> |
| 320 | |
| 321 | |
| 322 | FRAME |
| 323 | ----- |
| 324 | FRAME records a single program location: |
| 325 | |
| 326 | <frame> |
| 327 | <ip>HEX64</ip> |
| 328 | optionally <obj>TEXT</obj> |
| 329 | optionally <fn>TEXT</fn> |
sewardj | 57d99c5 | 2005-06-13 16:44:33 +0000 | [diff] [blame] | 330 | optionally <dir>TEXT</dir> |
sewardj | 9829e38 | 2005-05-24 14:17:41 +0000 | [diff] [blame] | 331 | optionally <file>TEXT</file> |
| 332 | optionally <line>INT</line> |
| 333 | </frame> |
| 334 | |
| 335 | Only the <ip> field is guaranteed to be present. It indicates a |
| 336 | code ("instruction pointer") address. |
| 337 | |
| 338 | The optional fields, if present, appear in the order stated: |
| 339 | |
| 340 | * obj: gives the name of the ELF object containing the code address |
| 341 | |
| 342 | * fn: gives the name of the function containing the code address |
| 343 | |
sewardj | 57d99c5 | 2005-06-13 16:44:33 +0000 | [diff] [blame] | 344 | * dir: gives the source directory associated with the name specified |
| 345 | by <file>. Note the current implementation often does not |
| 346 | put anything useful in this field. |
| 347 | |
sewardj | 9829e38 | 2005-05-24 14:17:41 +0000 | [diff] [blame] | 348 | * file: gives the name of the source file containing the code address |
| 349 | |
| 350 | * line: gives the line number in the source file |
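Because only <ip> is guaranteed, a consumer has to cope with any of the other fields being absent. The Python sketch below shows one way to handle this. The names parse_frame and describe, and the display format that describe produces, are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

FRAME_FIELDS = ("ip", "obj", "fn", "dir", "file", "line")


def parse_frame(elem):
    """Return a dict with one entry per field; absent fields map to None."""
    frame = {name: elem.findtext(name) for name in FRAME_FIELDS}
    if frame["ip"] is None:
        raise ValueError("<frame> without <ip>")
    frame["ip"] = int(frame["ip"], 16)  # HEX64, with leading "0x"
    if frame["line"] is not None:
        frame["line"] = int(frame["line"])
    return frame


def describe(frame):
    """Build a one-line description using whatever fields are available."""
    text = hex(frame["ip"])
    if frame["fn"]:
        text += ": " + frame["fn"]
    if frame["file"]:
        location = frame["file"]
        if frame["line"] is not None:
            location += ":" + str(frame["line"])
        text += " (" + location + ")"
    elif frame["obj"]:
        text += " (in " + frame["obj"] + ")"
    return text


# Invented frames: one with every field, one with only the address.
full = parse_frame(ET.fromstring(
    "<frame><ip>0x80483F5</ip><obj>/tmp/a.out</obj><fn>main</fn>"
    "<dir>/tmp</dir><file>a.c</file><line>12</line></frame>"
))
bare = parse_frame(ET.fromstring("<frame><ip>0x1B9008A1</ip></frame>"))
```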
| 351 | |
| 352 | |
| 353 | ERRORCOUNTS |
| 354 | ----------- |
| 355 | This specifies, for each error that has been so far presented, |
| 356 | the number of occurrences of that error. |
| 357 | |
| 358 | <errorcounts> |
| 359 | zero or more of |
| 360 | <pair> <count>INT</count> <unique>HEX64</unique> </pair> |
| 361 | </errorcounts> |
| 362 | |
| 363 | Each <pair> gives the current error count <count> for the error with |
| 364 | unique tag <unique>. The counts do not have to give a count for each |
| 365 | error so far presented - partial information is allowable. |
| 366 | |
sewardj | 9e7212f | 2005-05-24 15:00:55 +0000 | [diff] [blame] | 367 | As at Valgrind rev 3793, error counts are only emitted at program |
sewardj | 9829e38 | 2005-05-24 14:17:41 +0000 | [diff] [blame] | 368 | termination. However, it is perfectly acceptable to periodically emit |
| 369 | error counts as the program is running. Doing so would allow a |
| 370 | GUI to dynamically update its error-count display as the program runs. |
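Since an ERRORCOUNTS block may be partial and may appear more than once, a consumer can keep a running table and overwrite only the entries each block mentions. The Python sketch below illustrates this. The function name update_counts and the two sample blocks are assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def update_counts(counts, errorcounts_elem):
    """Merge one <errorcounts> block into counts, keyed by unique number."""
    for pair in errorcounts_elem.findall("pair"):
        unique = int(pair.findtext("unique"), 16)
        counts[unique] = int(pair.findtext("count"))
    return counts


# Two invented blocks; the second mentions only one of the earlier errors.
first = ET.fromstring(
    "<errorcounts>"
    "<pair> <count>2</count> <unique>0x1</unique> </pair>"
    "<pair> <count>1</count> <unique>0x2</unique> </pair>"
    "</errorcounts>"
)
second = ET.fromstring(
    "<errorcounts>"
    "<pair> <count>5</count> <unique>0x1</unique> </pair>"
    "</errorcounts>"
)

counts = {}
update_counts(counts, first)
update_counts(counts, second)
```

Entries that a later block omits keep their previous value, which matches the statement that partial information is allowable.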
| 371 | |
| 372 | |
| 373 | SUPPCOUNTS |
| 374 | ---------- |
| 375 | A SUPPCOUNTS block appears exactly once, after the program terminates. |
| 376 | It specifies the number of times each error-suppression was used. |
| 377 | Suppressions not mentioned were used zero times. |
| 378 | |
| 379 | <suppcounts> |
| 380 | zero or more of |
sewardj | 7c9e57c | 2005-05-24 14:21:45 +0000 | [diff] [blame] | 381 | <pair> <count>INT</count> <name>TEXT</name> </pair> |
sewardj | 9829e38 | 2005-05-24 14:17:41 +0000 | [diff] [blame] | 382 | </suppcounts> |
| 383 | |
| 384 | The <name> is as specified in the suppression name fields in .supp |
| 385 | files. |
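Reading SUPPCOUNTS is similar, except that entries are keyed by name and a missing name means zero uses. A minimal Python sketch follows. The function names and the suppression names in the sample are invented for illustration.

```python
import xml.etree.ElementTree as ET


def parse_suppcounts(elem):
    """Return a dict mapping suppression name to the number of times used."""
    return {
        pair.findtext("name"): int(pair.findtext("count"))
        for pair in elem.findall("pair")
    }


def times_used(counts, name):
    # Suppressions not mentioned in the block were used zero times.
    return counts.get(name, 0)


# Invented block with hypothetical suppression names.
sample = ET.fromstring(
    "<suppcounts>"
    "<pair> <count>3</count> <name>example-libc-supp</name> </pair>"
    "<pair> <count>1</count> <name>example-x11-supp</name> </pair>"
    "</suppcounts>"
)
supp_counts = parse_suppcounts(sample)
```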
sewardj | 57d99c5 | 2005-06-13 16:44:33 +0000 | [diff] [blame] | 386 | |