<?xml version="1.0"?> <!-- -*- sgml -*- -->
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
 "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd"
[ <!ENTITY % cl-entities SYSTEM "cl-entities.xml"> %cl-entities; ]>

<chapter id="cl-format" xreflabel="Callgrind Format Specification">
<title>Callgrind Format Specification</title>

<para>This chapter describes the Callgrind Profile Format, Version 1.</para>

<para>A synonymous name is "Calltree Profile Format". Both names refer to the
same format, since Callgrind was previously named Calltree.</para>

<para>The format description is meant to help users understand the file
contents; more importantly, it is given so that authors of measurement or
visualization tools can write and read this format.</para>

<sect1 id="cl-format.overview" xreflabel="Overview">
<title>Overview</title>

<para>The profile data format is ASCII based.
It is written by Callgrind, and it is upward compatible
with the format used by Cachegrind (i.e. Cachegrind uses a subset). It can
be read by callgrind_annotate and KCachegrind.</para>

<para>This chapter gives an overview of format features and examples.
For detailed syntax, look at the format reference.</para>

<sect2 id="cl-format.overview.basics" xreflabel="Basic Structure">
<title>Basic Structure</title>

<para>Each file has a header part consisting of an arbitrary number of lines
of the format "key: value". The lines with key "positions" and "events" define
the meaning of cost lines in the second part of the file: the value of
"positions" is a list of subpositions, and the value of "events" is a list
of event type names. Cost lines consist of subpositions followed by 64-bit
counters for the events, in the order specified by the "positions" and "events"
header lines.</para>

<para>The "events" header line is always required, in contrast to the optional
line for "positions", which defaults to "line", i.e. a line number of some
source file. In addition, the second part of the file contains position
specifications of the form "spec=name". "spec" can be e.g. "fn" for a
function name or "fl" for a file name. Cost lines always relate to
the function/file specifications given directly before them.</para>
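<para>As an illustration of this two-part layout, the following Python sketch
separates header lines from body lines. The function name "split_profile" and
the rule used to detect the end of the header (the first line that is neither
blank, a comment, nor of the form "key: value") are assumptions made for this
example; they are not part of the format definition.</para>
<programlisting><![CDATA[
```python
def split_profile(text):
    """Split profile text into (header dict, list of body lines).

    Assumption for this sketch: the header ends at the first line that is
    neither blank, a "#" comment, nor of the form "key: value".
    """
    header = {}
    body = []
    in_header = True
    for line in text.splitlines():
        stripped = line.strip()
        if in_header:
            if not stripped or stripped.startswith("#"):
                continue
            key, colon, value = stripped.partition(":")
            # Header keys are single words; "fl=file.f" or "15 90" are body lines.
            if colon and key.isalpha():
                header[key] = value.strip()
                continue
            in_header = False
        body.append(line)
    # "positions" is optional and defaults to "line"; "events" is required.
    header.setdefault("positions", "line")
    if "events" not in header:
        raise ValueError('missing required "events" header line')
    return header, body
```
]]></programlisting>
<para>A real reader would also have to keep repeated keys such as "event:" or
"desc:", which this sketch simply overwrites.</para>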

</sect2>

<sect2 id="cl-format.overview.example1" xreflabel="Simple Example">
<title>Simple Example</title>

<para>The event names in the following example are quite arbitrary, and are not
related to event names used by Callgrind. In particular, cycle counts matching
real processors will probably never be generated by any Valgrind tool, as these
tools are bound to simulations of simple machine models to keep the slowdown
acceptable. However, any profiling tool could use the format described in this
chapter.</para>

<para>
<screen>events: Cycles Instructions Flops
fl=file.f
fn=main
15 90 14 2
16 20 12</screen></para>

<para>The above example gives profile information for event types "Cycles",
"Instructions", and "Flops". Thus, cost lines give the number of CPU cycles
that passed, the number of executed instructions, and the number of floating
point operations executed while running code corresponding to some source
position. As there is no line specifying the value of "positions", it defaults
to "line", which means that the first number of a cost line is always a line
number.</para>

<para>Thus, the first cost line specifies that in line 15 of source file
"file.f" there is code belonging to function "main". While running this code,
90 CPU cycles passed, and 2 of the 14 instructions executed were floating point
operations. Similarly, the next line specifies that there were 12 instructions
executed in the context of function "main" which can be related to line 16 in
file "file.f", taking 20 CPU cycles. If a cost line specifies fewer event counts
than given in the "events" line, the rest are assumed to be zero. That is, no
floating point instruction was executed relating to line 16.</para>

<para>Note that regular cost lines always give the self (also called exclusive)
cost of code at a given position. If you specify multiple cost lines for the
same position, these will be summed up. On the other hand, in the example above
there is no specification of how many times function "main" was actually
called: profile data only contains sums.</para>
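<para>The rules above (default "line" subposition, missing counts read as zero,
repeated positions summed up) can be sketched in Python as follows. The names
"EXAMPLE" and "self_costs" are hypothetical, and the sketch only handles "fl=",
"fn=" and plain cost lines.</para>
<programlisting><![CDATA[
```python
from collections import defaultdict

EXAMPLE = """\
events: Cycles Instructions Flops
fl=file.f
fn=main
15 90 14 2
16 20 12
"""

def self_costs(text):
    """Return (event names, {(file, function, line): [count per event]}).

    Assumptions for this sketch: "positions" keeps its default ("line") and
    the body holds only "fl=", "fn=" and plain cost lines.
    """
    events = []
    current = {"fl": None, "fn": None}
    totals = defaultdict(list)
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("events:"):
            events = line.split(":", 1)[1].split()
        elif line[:3] in ("fl=", "fn="):
            current[line[:2]] = line[3:].strip()
        else:
            lineno, *counts = (int(token) for token in line.split())
            # Event counts missing at the end of a cost line are zero.
            counts += [0] * (len(events) - len(counts))
            key = (current["fl"], current["fn"], lineno)
            # Several cost lines for the same position are summed up.
            previous = totals[key] or [0] * len(events)
            totals[key] = [a + b for a, b in zip(previous, counts)]
    return events, dict(totals)
```
]]></programlisting>
<para>For the example above, the returned mapping relates ("file.f", "main", 16)
to the counts 20, 12 and 0.</para>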

</sect2>


<sect2 id="cl-format.overview.associations" xreflabel="Associations">
<title>Associations</title>

<para>The most important extension to the original format of Cachegrind is the
ability to specify call relationships among functions. More generally, you
specify associations among positions. For this, the second part of the
file can also contain association specifications. These look similar to
position specifications, but consist of 2 lines. For calls, the format
looks like
<screen>
 calls=(Call Count) (Destination position)
 (Source position) (Inclusive cost of call)
</screen></para>

<para>The destination only specifies subpositions like line number. Therefore,
to be able to specify a call to another function in another source file, you
have to precede the above lines with a "cfn=" specification for the name of the
called function, and a "cfl=" specification if the function is in another
source file. The 2nd line looks like a regular cost line, with the difference
that the inclusive cost spent inside of the function call has to be
specified.</para>

<para>Other associations are, for example, (conditional) jumps. See the
reference below for details.</para>

</sect2>


<sect2 id="cl-format.overview.example2" xreflabel="Extended Example">
<title>Extended Example</title>

<para>The following example shows 3 functions, "main", "func1", and
"func2". Function "main" calls "func1" once and "func2" 3 times. "func1" calls
"func2" 2 times.
<screen>events: Instructions

fl=file1.c
fn=main
16 20
cfn=func1
calls=1 50
16 400
cfl=file2.c
cfn=func2
calls=3 20
16 400

fn=func1
51 100
cfl=file2.c
cfn=func2
calls=2 20
51 300

fl=file2.c
fn=func2
20 700</screen></para>

<para>One can see that in "main" only code from line 16 is executed, and that
this is also where the other functions are called. The inclusive cost of "main"
is 820, which is the sum of self cost 20 and costs spent in the calls: 400 for
the single call to "func1" and 400 as the sum for the three calls to
"func2".</para>

<para>Function "func1" is located in "file1.c", the same file as "main".
Therefore, a "cfl=" specification for the call to "func1" is not needed. The
function "func1" only consists of code at line 51 of "file1.c", where "func2"
is called.</para>
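<para>The arithmetic above can be reproduced with a small Python sketch that
adds up self cost and call cost per function. The names "EXAMPLE",
"function_costs" and "inclusive" are made up for this illustration; the sketch
assumes a single event type and uncompressed names, as in the example.</para>
<programlisting><![CDATA[
```python
from collections import defaultdict

EXAMPLE = """\
events: Instructions

fl=file1.c
fn=main
16 20
cfn=func1
calls=1 50
16 400
cfl=file2.c
cfn=func2
calls=3 20
16 400

fn=func1
51 100
cfl=file2.c
cfn=func2
calls=2 20
51 300

fl=file2.c
fn=func2
20 700
"""

def function_costs(text):
    """Return {function: {"self": cost, "calls": cost}} for one event type."""
    costs = defaultdict(lambda: {"self": 0, "calls": 0})
    function = None
    after_calls = False  # True if the next cost line belongs to a "calls=" line
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or line.startswith("events:"):
            continue
        if line.startswith("fn="):
            function = line[3:]
        elif line.startswith("calls="):
            after_calls = True
        elif "=" in line:
            continue  # fl=, cfl=, cfn=: not needed for per-function sums
        else:
            _position, cost = line.split()
            costs[function]["calls" if after_calls else "self"] += int(cost)
            after_calls = False
    return dict(costs)

def inclusive(entry):
    """Inclusive cost: self cost plus the inclusive cost of all calls made."""
    return entry["self"] + entry["calls"]
```
]]></programlisting>
<para>For "main", this yields a self cost of 20 and a call cost of 800, i.e.
the inclusive cost of 820 stated above.</para>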

</sect2>


<sect2 id="cl-format.overview.compression1" xreflabel="Name Compression">
<title>Name Compression</title>

<para>With the introduction of association specifications like calls, it
becomes necessary to specify the same function or file name multiple times. As
absolute filenames or symbol names in C++ can be quite long, it is advantageous
to be able to specify integer IDs for position specifications.
Here, the term "position" corresponds to a file name (source or object file)
or function name.</para>

<para>To support name compression, a position specification can be not only of
the format "spec=name", but also "spec=(ID) name" to specify a mapping of an
integer ID to a name, and "spec=(ID)" to reference a previously defined ID
mapping. There is a separate ID mapping for each position specification,
i.e. you can use ID 1 for both a file name and a symbol name.</para>

<para>With name compression, the extended example above looks like this:
<screen>events: Instructions

fl=(1) file1.c
fn=(1) main
16 20
cfn=(2) func1
calls=1 50
16 400
cfl=(2) file2.c
cfn=(3) func2
calls=3 20
16 400

fn=(2)
51 100
cfl=(2)
cfn=(3)
calls=2 20
51 300

fl=(2)
fn=(3)
20 700</screen></para>

<para>As position specifications carry no information themselves, but only change
the meaning of subsequent cost lines or associations, they can appear
anywhere in the file without any negative consequence. In particular, you can
define name compression mappings directly after the header, and before any cost
lines. Thus, the above example can also be written as
<screen>events: Instructions

# define file ID mapping
fl=(1) file1.c
fl=(2) file2.c
# define function ID mapping
fn=(1) main
fn=(2) func1
fn=(3) func2

fl=(1)
fn=(1)
16 20
...</screen></para>
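<para>One possible way for a reader to resolve compressed names is sketched
below in Python. The class name "NameTable" is hypothetical. The sketch assumes
that "fn" and "cfn" share one ID table, as do the file name specifications and
the object specifications; this is inferred from the example above, where
"cfn=(2) func1" is later referenced as "fn=(2)", and is not stated explicitly
in this chapter.</para>
<programlisting><![CDATA[
```python
import re

# Assumption inferred from the example: caller and callee variants of a
# specification share one ID table.
TABLE_OF = {"fl": "file", "fi": "file", "fe": "file", "cfl": "file",
            "fn": "fn", "cfn": "fn", "ob": "ob", "cob": "ob"}

SPEC = re.compile(r"^(c?ob|c?fl|fi|fe|c?fn)=\s*(?:\((\d+)\))?\s*(.*)$")

class NameTable:
    """Resolve "spec=name", "spec=(ID) name" and "spec=(ID)" lines."""

    def __init__(self):
        self.tables = {"file": {}, "fn": {}, "ob": {}}

    def resolve(self, line):
        """Return (spec, name) for one position specification line."""
        match = SPEC.match(line.strip())
        if match is None:
            raise ValueError("not a position specification: " + line)
        spec, ident, name = match.groups()
        if ident is None:
            return spec, name              # uncompressed "spec=name"
        table = self.tables[TABLE_OF[spec]]
        if name:
            table[int(ident)] = name       # "spec=(ID) name" defines a mapping
        return spec, table[int(ident)]     # "spec=(ID)" looks the name up
```
]]></programlisting>
<para>Referencing an ID that has not been defined raises a KeyError in this
sketch; how a real tool should react to such a file is left open here.</para>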

</sect2>


<sect2 id="cl-format.overview.compression2" xreflabel="Subposition Compression">
<title>Subposition Compression</title>

<para>If a Callgrind data file should hold costs for each assembler instruction
of a program, you specify subposition "instr" in the "positions:" header line,
and each cost line has to include the address of some instruction. Addresses
are allowed to be 64 bits wide to support 64-bit architectures. Repeating
similar, long addresses for almost every line in the data file can therefore
enlarge the file significantly, which motivates subposition compression:
instead of every cost line starting with a 16-character address, one is
allowed to specify relative addresses.
This relative specification is not only allowed for instruction addresses, but
also for line numbers; both addresses and line numbers are called
"subpositions".</para>

<para>A relative subposition is always based on the corresponding subposition
of the last cost line, and starts with a "+" to specify a positive difference,
a "-" to specify a negative difference, or consists of "*" to specify the same
subposition. Because absolute subpositions are always positive (i.e. never
prefixed by "-"), any relative specification is unambiguous; additionally,
absolute and relative subposition specifications can be mixed freely.
Consider the following example (subpositions can always be specified
as hexadecimal numbers, beginning with "0x"):
<screen>positions: instr line
events: ticks

fn=func
0x80001234 90 1
0x80001237 90 5
0x80001238 91 6</screen></para>

<para>With subposition compression, this looks like
<screen>positions: instr line
events: ticks

fn=func
0x80001234 90 1
+3 * 5
+1 +1 6</screen></para>

<para>Remark: For assembler annotation to work, instruction addresses have to
be corrected to correspond to addresses found in the original binary. For
example, for relocatable shared objects, a load offset often has to be
subtracted.</para>
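<para>Decoding relative subpositions amounts to remembering the absolute
subpositions of the previous cost line. A minimal Python sketch follows; the
helper names "parse_number", "decode_subpositions" and "decode_cost_lines" are
hypothetical, and the number of subpositions per cost line is assumed to be
known from the "positions:" header line.</para>
<programlisting><![CDATA[
```python
def parse_number(token):
    """Parse a decimal or "0x"-prefixed hexadecimal number with optional sign."""
    sign = -1 if token.startswith("-") else 1
    digits = token.lstrip("+-")
    base = 16 if digits.lower().startswith("0x") else 10
    return sign * int(digits, base)

def decode_subpositions(tokens, previous):
    """Return the absolute subpositions for one cost line.

    `previous` is the list of absolute subpositions of the last cost line,
    or None if there has been no cost line yet.
    """
    absolute = []
    for index, token in enumerate(tokens):
        if token == "*":
            absolute.append(previous[index])                   # unchanged
        elif token[0] in "+-":
            absolute.append(previous[index] + parse_number(token))
        else:
            absolute.append(parse_number(token))               # absolute value
    return absolute

def decode_cost_lines(lines, n_subpositions):
    """Yield (absolute subpositions, costs) for each cost line."""
    previous = None
    for line in lines:
        tokens = line.split()
        previous = decode_subpositions(tokens[:n_subpositions], previous)
        yield previous, [int(token) for token in tokens[n_subpositions:]]
```
]]></programlisting>
<para>Applied to the three compressed cost lines above with two subpositions,
this reproduces the addresses 0x80001234, 0x80001237 and 0x80001238 and the
line numbers 90, 90 and 91 of the uncompressed example.</para>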

</sect2>


<sect2 id="cl-format.overview.misc" xreflabel="Miscellaneous">
<title>Miscellaneous</title>

<sect3 id="cl-format.overview.misc.summary" xreflabel="Cost Summary Information">
<title>Cost Summary Information</title>

<para>For the visualization to be able to show cost percentages, the sum of the
cost of the full run has to be known. Usually, it is assumed that this is the
sum of all cost lines in a file. But sometimes, this is not correct. Thus, you
can specify a "summary:" line in the header giving the full cost for the
profile run. This has another effect: an import filter can show a progress bar
while loading a large data file if it knows the cost sum in advance.</para>

</sect3>

<sect3 id="cl-format.overview.misc.events" xreflabel="Long Names for Event Types and inherited Types">
<title>Long Names for Event Types and inherited Types</title>

<para>Event types for cost lines are specified in the "events:" line with an
abbreviated name. For visualization, it makes sense to be able to specify some
longer, more descriptive name. For an event type "Ir", which means "Instruction
Fetches", this can be specified with the header line
<screen>event: Ir : Instruction Fetches
events: Ir Dr</screen></para>

<para>In this example, "Dr" itself has no long name associated. The order of
"event:" lines and the "events:" line is of no importance. Additionally,
inherited event types can be introduced for which no raw data is available, but
which are calculated from given types. Continuing the last example, you could
add
<screen>event: Sum = Ir + Dr</screen>
to specify an additional event type "Sum", which is calculated by adding costs
for "Ir" and "Dr".</para>
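<para>A tool reading the file could evaluate such an inherited event type as
sketched below in Python. The names "TERM", "parse_inherited" and "evaluate"
are made up for this illustration; the optional integer factor in front of an
event name follows the InheritedExpr rule of the grammar in the reference
section.</para>
<programlisting><![CDATA[
```python
import re

# One term of the sum: an optional integer factor (with optional "*"),
# followed by an event name.
TERM = re.compile(r"\s*(?:(\d+)\s*\*?\s*)?([A-Za-z][A-Za-z0-9]*)\s*")

def parse_inherited(definition):
    """Parse the text after "event:", e.g. "Sum = Ir + Dr", into (name, terms).

    Each term is a (factor, event name) pair; the factor defaults to 1.
    """
    definition = definition.partition(":")[0]   # drop an optional long name
    name, equals, expression = definition.partition("=")
    if not equals:
        raise ValueError("not an inherited event type: " + definition)
    terms = []
    for term in expression.split("+"):
        match = TERM.fullmatch(term)
        if match is None:
            raise ValueError("cannot parse term: " + term)
        factor, event = match.groups()
        terms.append((int(factor or 1), event))
    return name.strip(), terms

def evaluate(terms, costs):
    """Compute the inherited cost from a {event name: cost} mapping."""
    return sum(factor * costs[event] for factor, event in terms)
```
]]></programlisting>
<para>With the terms parsed from "Sum = Ir + Dr" and made-up costs of 10 for
"Ir" and 4 for "Dr", the evaluation gives 14.</para>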

</sect3>

</sect2>

</sect1>

<sect1 id="cl-format.reference" xreflabel="Reference">
<title>Reference</title>

<sect2 id="cl-format.reference.grammar" xreflabel="Grammar">
<title>Grammar</title>

<para>
<screen>ProfileDataFile := FormatVersion? Creator? PartData*</screen>
<screen>FormatVersion := "version:" Space* Number "\n"</screen>
<screen>Creator := "creator:" NoNewLineChar* "\n"</screen>
<screen>PartData := (HeaderLine "\n")+ (BodyLine "\n")+</screen>
<screen>HeaderLine := (empty line)
 | ('#' NoNewLineChar*)
 | PartDetail
 | Description
 | EventSpecification
 | CostLineDef</screen>
<screen>PartDetail := TargetCommand | TargetID</screen>
<screen>TargetCommand := "cmd:" Space* NoNewLineChar*</screen>
<screen>TargetID := ("pid"|"thread"|"part") ":" Space* Number</screen>
<screen>Description := "desc:" Space* Name Space* ":" NoNewLineChar*</screen>
<screen>EventSpecification := "event:" Space* Name InheritedDef? LongNameDef?</screen>
<screen>InheritedDef := "=" InheritedExpr</screen>
<screen>InheritedExpr := Name
 | Number Space* ("*" Space*)? Name
 | InheritedExpr Space* "+" Space* InheritedExpr</screen>
<screen>LongNameDef := ":" NoNewLineChar*</screen>
<screen>CostLineDef := "events:" Space* Name (Space+ Name)*
 | "positions:" "instr"? (Space+ "line")?</screen>
<screen>BodyLine := (empty line)
 | ('#' NoNewLineChar*)
 | CostLine
 | PositionSpecification
 | AssociationSpecification</screen>
<screen>CostLine := SubPositionList Costs?</screen>
<screen>SubPositionList := (SubPosition+ Space+)+</screen>
<screen>SubPosition := Number | "+" Number | "-" Number | "*"</screen>
<screen>Costs := (Number Space+)+</screen>
<screen>PositionSpecification := Position "=" Space* PositionName</screen>
<screen>Position := CostPosition | CalledPosition</screen>
<screen>CostPosition := "ob" | "fl" | "fi" | "fe" | "fn"</screen>
<screen>CalledPosition := "cob" | "cfl" | "cfn"</screen>
<screen>PositionName := ( "(" Number ")" )? (Space* NoNewLineChar* )?</screen>
<screen>AssociationSpecification := CallSpecification
 | JumpSpecification</screen>
<screen>CallSpecification := CallLine "\n" CostLine</screen>
<screen>CallLine := "calls=" Space* Number Space+ SubPositionList</screen>
<screen>JumpSpecification := ...</screen>
<screen>Space := " " | "\t"</screen>
<screen>Number := HexNumber | (Digit)+</screen>
<screen>Digit := "0" | ... | "9"</screen>
<screen>HexNumber := "0x" (Digit | HexChar)+</screen>
<screen>HexChar := "a" | ... | "f" | "A" | ... | "F"</screen>
<screen>Name := Alpha (Digit | Alpha)*</screen>
<screen>Alpha := "a" | ... | "z" | "A" | ... | "Z"</screen>
<screen>NoNewLineChar := all characters without "\n"</screen>
</para>

</sect2>

<sect2 id="cl-format.reference.header" xreflabel="Description of Header Lines">
<title>Description of Header Lines</title>

<para>The header has an arbitrary number of lines of the format
"key: value". Possible <emphasis>key</emphasis> values for the header are:</para>

<itemizedlist>

  <listitem>
    <para><computeroutput>version: number</computeroutput> [Callgrind]</para>
    <para>This is used to distinguish future profile data formats. A
    major version of 0 or 1 is supposed to be upward compatible with
    Cachegrind's format. It is optional; if it does not appear, version 1
    is assumed. Otherwise, it has to be the first header line.</para>
  </listitem>

  <listitem>
    <para><computeroutput>pid: process id</computeroutput> [Callgrind]</para>
    <para>This specifies the process ID of the supervised application
    for which this profile was generated.</para>
  </listitem>

  <listitem>
    <para><computeroutput>cmd: program name + args</computeroutput> [Cachegrind]</para>
    <para>This specifies the full command line of the supervised
    application for which this profile was generated.</para>
  </listitem>

  <listitem>
    <para><computeroutput>part: number</computeroutput> [Callgrind]</para>
    <para>This specifies a sequentially incremented number for each dump
    generated, starting at 1.</para>
  </listitem>

  <listitem>
    <para><computeroutput>desc: type: value</computeroutput> [Cachegrind]</para>
    <para>This specifies various information for this dump. For some
    types, the semantics are defined, but any description type is allowed.
    Unknown types should be ignored.</para>
    <para>There are the types "I1 cache", "D1 cache", "L2 cache", which
    specify parameters used for the cache simulator. These are the only
    types originally used by Cachegrind. Additionally, Callgrind uses
    the following types: "Timerange" gives a rough range of the basic
    block counter for which the cost of this dump was collected.
    Type "Trigger" states the reason why this trace was generated,
    e.g. program termination or forced interactive dump.</para>
  </listitem>

  <listitem>
    <para><computeroutput>positions: [instr] [line]</computeroutput> [Callgrind]</para>
    <para>For cost lines, this defines the semantics of the first numbers.
    Any combination of "instr", "bb" and "line" is allowed, but has to be
    in this order, which corresponds to the position numbers at the start of
    the cost lines later in the file.</para>
    <para>If "instr" is specified, the position is the address of an
    instruction whose execution raised the events given later on the
    line. This address is relative to the offset of the binary/shared
    library file, so that relocation info does not have to be specified.
    For "line", the position is the line number of a source file which is
    responsible for the events raised. Note that the mapping between "instr"
    and "line" positions is given by the debugging line information
    produced by the compiler.</para>
    <para>This field is optional. If not specified, only "line" is
    assumed.</para>
  </listitem>

  <listitem>
    <para><computeroutput>events: event type abbreviations</computeroutput> [Cachegrind]</para>
    <para>A list of short names of the event types logged in this file.
    The order is the same as in cost lines. The first event type is the
    second or third number in a cost line, depending on the value of
    "positions". Callgrind does not add additional cost types. This line
    must be specified exactly once.</para>
    <para>Cost types from original Cachegrind are:
    <itemizedlist>
      <listitem>
        <para><command>Ir</command>: Instruction read access</para>
      </listitem>
      <listitem>
        <para><command>I1mr</command>: Instruction Level 1 read cache miss</para>
      </listitem>
      <listitem>
        <para><command>I2mr</command>: Instruction Level 2 read cache miss</para>
      </listitem>
      <listitem>
        <para>...</para>
      </listitem>
    </itemizedlist>
    </para>
  </listitem>

  <listitem>
    <para><computeroutput>summary: costs</computeroutput> [Callgrind]</para>
    <para><computeroutput>totals: costs</computeroutput> [Cachegrind]</para>
    <para>The value is the total number of events covered by this trace
    file. Both keys have the same meaning, but the "totals:" line
    happens to be at the end of the file, while "summary:" appears in
    the header. The "summary:" line was added to allow postprocessing tools
    to know the total cost in advance. The two lines always give the same
    cost counts.</para>
  </listitem>

</itemizedlist>

</sect2>

<sect2 id="cl-format.reference.body" xreflabel="Description of Body Lines">
<title>Description of Body Lines</title>

<para>Position specifications are lines of the form
<computeroutput>spec=position</computeroutput>. The values for position
specifications are arbitrary strings. A value starting with "(" and a
digit is a string in compressed format. Otherwise it is the real
position string. This allows for file and symbol names as position
strings, as these never start with "(" + <emphasis>digit</emphasis>.
The compressed format is either "(" <emphasis>number</emphasis> ")"
<emphasis>space</emphasis> <emphasis>position</emphasis> or only
"(" <emphasis>number</emphasis> ")". The first form relates
<emphasis>position</emphasis> to <emphasis>number</emphasis> in the
context of the given format specification from this line to the end of
the file; it makes the (<emphasis>number</emphasis>) an alias for
<emphasis>position</emphasis>. Compressed format is always
optional.</para>

<para>Position specifications allowed:</para>
<itemizedlist>

  <listitem>
    <para><computeroutput>ob=</computeroutput> [Callgrind]</para>
    <para>The ELF object in which the cost of the next cost lines occurs.</para>
  </listitem>

  <listitem>
    <para><computeroutput>fl=</computeroutput> [Cachegrind]</para>
  </listitem>

  <listitem>
    <para><computeroutput>fi=</computeroutput> [Cachegrind]</para>
  </listitem>

  <listitem>
    <para><computeroutput>fe=</computeroutput> [Cachegrind]</para>
    <para>The source file including the code which is responsible for
    the cost of the next cost lines. "fi="/"fe=" is used when the source
    file changes inside of a function, i.e. for inlined code.</para>
  </listitem>

  <listitem>
    <para><computeroutput>fn=</computeroutput> [Cachegrind]</para>
    <para>The name of the function in which the cost of the next cost lines
    occurs.</para>
  </listitem>

  <listitem>
    <para><computeroutput>cob=</computeroutput> [Callgrind]</para>
    <para>The ELF object of the target of the next call cost lines.</para>
  </listitem>

  <listitem>
    <para><computeroutput>cfl=</computeroutput> [Callgrind]</para>
    <para>The source file including the code of the target of the
    next call cost lines.</para>
  </listitem>

  <listitem>
    <para><computeroutput>cfn=</computeroutput> [Callgrind]</para>
    <para>The name of the target function of the next call cost
    lines.</para>
  </listitem>

  <listitem>
    <para><computeroutput>calls=</computeroutput> [Callgrind]</para>
    <para>The number of nonrecursive calls which are responsible for the
    cost specified by the next call cost line. This is the cost spent
    inside of the called function.</para>
    <para>After "calls=" there MUST be a cost line. This is the cost
    spent in the called function. The first number is the source line
    from which the call was made.</para>
  </listitem>

  <listitem>
    <para><computeroutput>jump=count target position</computeroutput> [Callgrind]</para>
    <para>Unconditional jump, executed count times, to the given target
    position.</para>
  </listitem>

  <listitem>
    <para><computeroutput>jcnd=exe.count jumpcount target position</computeroutput> [Callgrind]</para>
    <para>Conditional jump, executed exe.count times with jumpcount
    jumps to the given target position.</para>
  </listitem>

</itemizedlist>

</sect2>

</sect1>

</chapter>