<?xml version="1.0"?> <!-- -*- sgml -*- -->
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.2//EN"
  "http://www.oasis-open.org/docbook/xml/4.2/docbookx.dtd"
[ <!ENTITY % cl-entities SYSTEM "cl-entities.xml"> %cl-entities; ]>

<chapter id="cl-format" xreflabel="Callgrind Format Specification">
<title>Callgrind Format Specification</title>

<para>This chapter describes the Callgrind Profile Format, Version 1.</para>

<para>A synonymous name is "Calltree Profile Format". Both names mean
the same thing, since Callgrind was previously named Calltree.</para>

<para>The format description is meant to help users understand the
file contents; more importantly, it allows authors of measurement or
visualization tools to write and read this format.</para>

<sect1 id="cl-format.overview" xreflabel="Overview">
<title>Overview</title>

<para>The profile data format is ASCII based.
It is written by Callgrind, and it is upwards compatible
with the format used by Cachegrind (i.e. Cachegrind uses a subset). It can
be read by callgrind_annotate and KCachegrind.</para>

<para>This section gives an overview of format features and examples.
For detailed syntax, see the format reference.</para>

<sect2 id="cl-format.overview.basics" xreflabel="Basic Structure">
<title>Basic Structure</title>

<para>Each file has a header part consisting of an arbitrary number of lines
of the format "key: value". The lines with key "positions" and "events" define
the meaning of cost lines in the second part of the file: the value of
"positions" is a list of subpositions, and the value of "events" is a list
of event type names. Cost lines consist of subpositions followed by 64-bit
counters for the events, in the order specified by the "positions" and "events"
header lines.</para>

<para>The "events" header line is always required, in contrast to the optional
line for "positions", which defaults to "line", i.e. a line number of some
source file. In addition, the second part of the file contains position
specifications of the form "spec=name". "spec" can be e.g. "fn" for a
function name or "fl" for a file name. Cost lines are always related to
the function/file specifications given directly before.</para>
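<para>The structure described above can be illustrated with a minimal reader,
written here in Python. This is only a sketch under stated assumptions: the
names "parse_number" and "parse_profile" are hypothetical and not part of any
Valgrind tool, and the sketch ignores associations as well as name and
subposition compression, which are introduced later in this chapter.</para>

<programlisting><![CDATA[
```python
def parse_number(token):
    """Parse a decimal or "0x"-prefixed hexadecimal number."""
    return int(token, 16) if token.lower().startswith("0x") else int(token)


def parse_profile(text):
    """Return (positions, events, cost_lines) for an uncompressed profile.

    Hypothetical helper: handles header lines, position specifications and
    plain cost lines only.
    """
    positions = ["line"]   # default when no "positions" header line is given
    events = []
    context = {}           # most recent "spec=name" values, e.g. fl and fn
    cost_lines = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if line[0].isdigit():
            # Cost line: subpositions first, then one counter per event type.
            numbers = [parse_number(tok) for tok in line.split()]
            split = len(positions)
            cost_lines.append((dict(context), numbers[:split], numbers[split:]))
            continue
        key, sep, value = line.partition(":")
        if sep and "=" not in key:
            # Header line of the form "key: value".
            if key == "positions":
                positions = value.split()
            elif key == "events":
                events = value.split()
            continue
        spec, sep, name = line.partition("=")
        if sep:
            # Position specification such as "fn=main".
            context[spec] = name.strip()
    return positions, events, cost_lines


SIMPLE = """events: Cycles Instructions Flops
fl=file.f
fn=main
15 90 14 2
16 20 12"""

positions, events, cost_lines = parse_profile(SIMPLE)
```
]]></programlisting>

<para>Each entry of "cost_lines" carries a copy of the file and function
context that was in effect when the cost line was read, which mirrors the rule
that cost lines relate to the specifications given directly before.</para>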

</sect2>

<sect2 id="cl-format.overview.example1" xreflabel="Simple Example">
<title>Simple Example</title>

<para>The event names in the following example are quite arbitrary, and are not
related to event names used by Callgrind. In particular, cycle counts matching
real processors will probably never be generated by any Valgrind tool, as these
are bound to simulations of simple machine models for acceptable slowdown.
However, any profiling tool could use the format described in this chapter.</para>

<para>
<screen>events: Cycles Instructions Flops
fl=file.f
fn=main
15 90 14 2
16 20 12</screen></para>

<para>The above example gives profile information for event types "Cycles",
"Instructions", and "Flops". Thus, cost lines give the number of CPU cycles
that passed, the number of executed instructions, and the number of floating
point operations executed while running code corresponding to some source
position. As there is no line specifying the value of "positions", it defaults
to "line", which means that the first number of a cost line is always a line
number.</para>

<para>Thus, the first cost line specifies that in line 15 of source file
"file.f" there is code belonging to function "main". While running, 90 CPU
cycles passed, and 2 of the 14 instructions executed were floating point
operations. Similarly, the next line specifies that there were 12 instructions
executed in the context of function "main" which can be related to line 16 in
file "file.f", taking 20 CPU cycles. If a cost line specifies fewer event counts
than given in the "events" line, the rest are assumed to be zero. That is, there
was no floating point instruction executed relating to line 16.</para>

<para>Note that regular cost lines always give self (also called exclusive)
cost of code at a given position. If you specify multiple cost lines for the
same position, these will be summed up. On the other hand, in the example above
there is no specification of how many times function "main" actually was
called: profile data only contains sums.</para>
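<para>The two rules above (missing event counts are zero, and repeated cost
lines for the same position are summed) can be sketched in Python as follows.
The function name "sum_self_costs" is hypothetical, and the third input line,
a second cost line for line 16, is invented for illustration only; it does
not appear in the example above.</para>

<programlisting><![CDATA[
```python
from collections import defaultdict


def sum_self_costs(cost_lines, n_events):
    """Sum self cost per position, padding short cost lines with zeros."""
    totals = defaultdict(lambda: [0] * n_events)
    for position, counts in cost_lines:
        # Event counts missing at the end of a cost line are assumed zero.
        padded = list(counts) + [0] * (n_events - len(counts))
        totals[position] = [a + b for a, b in zip(totals[position], padded)]
    return dict(totals)


# (line number, event counts) pairs; the last entry is an invented duplicate.
lines = [(15, [90, 14, 2]), (16, [20, 12]), (16, [5, 1])]
totals = sum_self_costs(lines, 3)
```
]]></programlisting>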

</sect2>


<sect2 id="cl-format.overview.associations" xreflabel="Associations">
<title>Associations</title>

<para>The most important extension to the original format of Cachegrind is the
ability to specify call relationships among functions. More generally, you
specify associations among positions. For this, the second part of the
file can also contain association specifications. These look similar to
position specifications, but consist of 2 lines. For calls, the format
looks like
<screen>
 calls=(Call Count) (Destination position)
 (Source position) (Inclusive cost of call)
</screen></para>

<para>The destination only specifies subpositions like line number. Therefore,
to be able to specify a call to another function in another source file, you
have to precede the above lines with a "cfn=" specification for the name of the
called function, and a "cfl=" specification if the function is in another
source file. The 2nd line looks like a regular cost line with the difference
that inclusive cost spent inside of the function call has to be specified.</para>

<para>Other associations are, for example, (conditional) jumps. See the
reference below for details.</para>

</sect2>


<sect2 id="cl-format.overview.example2" xreflabel="Extended Example">
<title>Extended Example</title>

<para>The following example shows 3 functions, "main", "func1", and
"func2". Function "main" calls "func1" once and "func2" 3 times. "func1" calls
"func2" 2 times.
<screen>events: Instructions

fl=file1.c
fn=main
16 20
cfn=func1
calls=1 50
16 400
cfl=file2.c
cfn=func2
calls=3 20
16 400

fn=func1
51 100
cfl=file2.c
cfn=func2
calls=2 20
51 300

fl=file2.c
fn=func2
20 700</screen></para>

<para>One can see that in "main" only code from line 16 is executed, which is
also where the other functions are called. Inclusive cost of "main" is 820,
which is the sum of self cost 20 and the costs spent in the calls: 400 for the
single call to "func1" and 400 in total for the three calls to "func2".</para>

<para>Function "func1" is located in "file1.c", the same as "main". Therefore,
a "cfl=" specification for the call to "func1" is not needed. The function
"func1" only consists of code at line 51 of "file1.c", where "func2" is called.</para>
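<para>As a cross-check of the example data, the inclusive cost of each
function can be computed as its self cost plus the inclusive cost of all calls
it makes. The following Python sketch does this under simplifying assumptions:
a single event type, the default "line" subposition only, and no name
compression. The function name "inclusive_costs" is hypothetical.</para>

<programlisting><![CDATA[
```python
def inclusive_costs(text):
    """Return {function: self cost + cost of its calls} for one event type."""
    self_cost = {}
    call_cost = {}
    fn = None
    after_calls = False   # True if the previous line was a "calls=" line
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("fn="):
            fn = line[3:]
        elif line.startswith("calls="):
            after_calls = True
        elif line and line[0].isdigit():
            # Cost line: one subposition (line number), then the event count.
            cost = int(line.split()[1])
            bucket = call_cost if after_calls else self_cost
            bucket[fn] = bucket.get(fn, 0) + cost
            after_calls = False
    return {f: self_cost[f] + call_cost.get(f, 0) for f in self_cost}


EXTENDED = """events: Instructions

fl=file1.c
fn=main
16 20
cfn=func1
calls=1 50
16 400
cfl=file2.c
cfn=func2
calls=3 20
16 400

fn=func1
51 100
cfl=file2.c
cfn=func2
calls=2 20
51 300

fl=file2.c
fn=func2
20 700"""

result = inclusive_costs(EXTENDED)
```
]]></programlisting>

<para>For the data above this gives 820 for "main", 400 for "func1", and 700
for "func2". The 700 of "func2" equals the 400 attributed to the calls from
"main" plus the 300 attributed to the calls from "func1", so the example is
self-consistent.</para>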

</sect2>


<sect2 id="cl-format.overview.compression1" xreflabel="Name Compression">
<title>Name Compression</title>

<para>With the introduction of association specifications like calls, it becomes
necessary to specify the same function or the same file name multiple times. As
absolute filenames or symbol names in C++ can be quite long, it is advantageous
to be able to specify integer IDs for position specifications.</para>

<para>To support name compression, a position specification can be not only of
the format "spec=name", but also "spec=(ID) name" to specify a mapping of an
integer ID to a name, and "spec=(ID)" to reference a previously defined ID
mapping. There is a separate ID mapping for each position specification,
i.e. you can use ID 1 for both a file name and a symbol name.</para>

<para>With string compression, the example from
<xref linkend="cl-format.overview.example2"/> looks like this:
<screen>events: Instructions

fl=(1) file1.c
fn=(1) main
16 20
cfn=(2) func1
calls=1 50
16 400
cfl=(2) file2.c
cfn=(3) func2
calls=3 20
16 400

fn=(2)
51 100
cfl=(2)
cfn=(3)
calls=2 20
51 300

fl=(2)
fn=(3)
20 700</screen></para>

<para>As position specifications carry no information themselves, but only change
the meaning of subsequent cost lines or associations, they can appear
anywhere in the file without any negative consequence. In particular, you can
define name compression mappings directly after the header, and before any cost
lines. Thus, the above example can also be written as
<screen>events: Instructions

# define file ID mapping
fl=(1) file1.c
fl=(2) file2.c
# define function ID mapping
fn=(1) main
fn=(2) func1
fn=(3) func2

fl=(1)
fn=(1)
16 20
...</screen></para>
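<para>Expanding compressed names back to plain "spec=name" lines could be
sketched in Python as shown below. The function name "expand_names" is
hypothetical. The sketch assumes that the ID tables are shared between a
position and its "called" counterpart (for example "fn" and "cfn"), and
between "fl", "fi", "fe" and "cfl"; this is inferred from the example above,
where "cfn=(2) func1" defines an ID that is later referenced by "fn=(2)".</para>

<programlisting><![CDATA[
```python
import re

# spec "=" optional "(ID)" optional name; see the grammar in the reference.
SPEC = re.compile(r"^(cob|cfl|cfn|ob|fl|fi|fe|fn)=\s*(?:\((\d+)\))?\s*(.*)$")

# Assumption: one ID table per kind of name (object, file, function).
KIND = {"ob": "ob", "cob": "ob",
        "fl": "fl", "fi": "fl", "fe": "fl", "cfl": "fl",
        "fn": "fn", "cfn": "fn"}


def expand_names(lines):
    """Replace "(ID) name" and "(ID)" forms by the plain name."""
    tables = {}
    out = []
    for line in lines:
        match = SPEC.match(line)
        if match is None:
            out.append(line)          # cost lines, "calls=" lines, etc.
            continue
        spec, ident, name = match.groups()
        if ident is not None:
            table = tables.setdefault(KIND[spec], {})
            if name:
                table[ident] = name   # "spec=(ID) name" defines a mapping
            else:
                name = table[ident]   # "spec=(ID)" references an earlier one
        out.append(f"{spec}={name}")
    return out


expanded = expand_names([
    "fl=(1) file1.c", "fn=(1) main", "16 20",
    "cfn=(2) func1", "calls=1 50", "16 400",
    "fn=(2)",
])
```
]]></programlisting>

<para>A reference to an ID that was never defined raises a "KeyError" in this
sketch; a real reader would probably want to report the offending line
instead.</para>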

</sect2>


<sect2 id="cl-format.overview.compression2" xreflabel="Subposition Compression">
<title>Subposition Compression</title>

<para>If a Callgrind data file should hold costs for each assembler instruction
of a program, you specify subposition "instr" in the "positions:" header line,
and each cost line has to include the address of some instruction. Addresses
are allowed to have a size of 64 bits to support 64-bit architectures. This
motivates subposition compression: instead of every cost line starting with
a 16 character long address, one is allowed to specify relative subpositions.</para>

<para>A relative subposition is always based on the corresponding subposition
of the last cost line, and starts with a "+" to specify a positive difference,
a "-" to specify a negative difference, or consists of "*" to specify the same
subposition. Consider the following example (subpositions can always be specified
as hexadecimal numbers, beginning with "0x"):
<screen>positions: instr line
events: ticks

fn=func
0x80001234 90 1
0x80001237 90 5
0x80001238 91 6</screen></para>

<para>With subposition compression, this looks like
<screen>positions: instr line
events: ticks

fn=func
0x80001234 90 1
+3 * 5
+1 +1 6</screen></para>

<para>Remark: For assembler annotation to work, instruction addresses have to
be corrected to correspond to addresses found in the original binary. That is, for
relocatable shared objects, often a load offset has to be subtracted.</para>
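<para>Decoding relative subpositions can be sketched in Python as follows. The
function names are hypothetical; the sketch keeps the subpositions of the last
cost line and applies "+", "-" and "*" to them as described above. It does not
reset this state at "fn=" lines, since the text above does not say whether a
reader should do so.</para>

<programlisting><![CDATA[
```python
def parse_number(token):
    """Parse a decimal or "0x"-prefixed hexadecimal number."""
    return int(token, 16) if token.lower().startswith("0x") else int(token)


def expand_subpositions(cost_lines, n_positions):
    """Turn relative subpositions into absolute ones.

    Returns a list of (subpositions, event counts) pairs.
    """
    last = [0] * n_positions
    out = []
    for line in cost_lines:
        tokens = line.split()
        for i in range(n_positions):
            token = tokens[i]
            if token == "*":
                continue                      # same as in the last cost line
            if token[0] in "+-":
                delta = parse_number(token[1:])
                last[i] += delta if token[0] == "+" else -delta
            else:
                last[i] = parse_number(token)  # absolute subposition
        counts = [parse_number(t) for t in tokens[n_positions:]]
        out.append((list(last), counts))
    return out


decoded = expand_subpositions(["0x80001234 90 1", "+3 * 5", "+1 +1 6"], 2)
```
]]></programlisting>

<para>Applied to the compressed example, this yields the same addresses and
line numbers as the uncompressed version: 0x80001234/90, 0x80001237/90 and
0x80001238/91.</para>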

</sect2>


<sect2 id="cl-format.overview.misc" xreflabel="Miscellaneous">
<title>Miscellaneous</title>

<sect3 id="cl-format.overview.misc.summary" xreflabel="Cost Summary Information">
<title>Cost Summary Information</title>

<para>For the visualization to be able to show cost percentages, the sum of the
cost of the full run has to be known. Usually, it is assumed that this is the
sum of all cost lines in a file. But sometimes, this is not correct. Thus, you
can specify a "summary:" line in the header giving the full cost for the
profile run. This has another effect: an import filter can show a progress bar
while loading a large data file if it knows the cost sum in advance.</para>

</sect3>

<sect3 id="cl-format.overview.misc.events" xreflabel="Long Names for Event Types and inherited Types">
<title>Long Names for Event Types and inherited Types</title>

<para>Event types for cost lines are specified in the "events:" line with an
abbreviated name. For visualization, it makes sense to be able to specify some
longer, more descriptive name. For an event type "Ir" which means "Instruction
Fetches", this can be specified with the header line
<screen>event: Ir : Instruction Fetches
events: Ir Dr</screen></para>

<para>In this example, "Dr" itself has no long name associated. The order of
"event:" lines and the "events:" line is of no importance. Additionally,
inherited event types can be introduced for which no raw data is available, but
which are calculated from given types. Continuing the last example, you could add
<screen>event: Sum = Ir + Dr</screen>
to specify an additional event type "Sum", which is calculated by adding costs
for "Ir" and "Dr".</para>
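<para>How a reader might evaluate an inherited event type is sketched below in
Python. The names "parse_inherited" and "evaluate" are hypothetical. The
sketch accepts a sum of terms, each an event name with an optional decimal
factor, which covers the "Sum = Ir + Dr" case above; hexadecimal factors are
not handled.</para>

<programlisting><![CDATA[
```python
import re

# One term: optional decimal factor, optional "*", then an event name.
TERM = re.compile(r"^(?:(\d+)\s*\*?\s*)?([A-Za-z][A-Za-z0-9]*)$")


def parse_inherited(definition):
    """Parse e.g. "Sum = Ir + Dr" into ("Sum", [(1, "Ir"), (1, "Dr")])."""
    # Drop an optional long name that may follow after ":".
    definition = definition.split(":", 1)[0]
    name, expression = definition.split("=", 1)
    terms = []
    for part in expression.split("+"):
        match = TERM.match(part.strip())
        if match is None:
            raise ValueError(f"cannot parse term {part!r}")
        terms.append((int(match.group(1) or 1), match.group(2)))
    return name.strip(), terms


def evaluate(terms, costs):
    """Compute the inherited cost from the raw costs of one cost line."""
    return sum(factor * costs[event] for factor, event in terms)


name, terms = parse_inherited("Sum = Ir + Dr")
total = evaluate(terms, {"Ir": 100, "Dr": 40})
```
]]></programlisting>

<para>The raw costs 100 and 40 used here are made-up values for illustration;
with them, the inherited type "Sum" evaluates to 140.</para>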

</sect3>

</sect2>

</sect1>

<sect1 id="cl-format.reference" xreflabel="Reference">
<title>Reference</title>

<sect2 id="cl-format.reference.grammar" xreflabel="Grammar">
<title>Grammar</title>

<para>
<screen>ProfileDataFile := FormatVersion? Creator? PartData*</screen>
<screen>FormatVersion := "version:" Space* Number "\n"</screen>
<screen>Creator := "creator:" NoNewLineChar* "\n"</screen>
<screen>PartData := (HeaderLine "\n")+ (BodyLine "\n")+</screen>
<screen>HeaderLine := (empty line)
    | ('#' NoNewLineChar*)
    | PartDetail
    | Description
    | EventSpecification
    | CostLineDef</screen>
<screen>PartDetail := TargetCommand | TargetID</screen>
<screen>TargetCommand := "cmd:" Space* NoNewLineChar*</screen>
<screen>TargetID := ("pid"|"thread"|"part") ":" Space* Number</screen>
<screen>Description := "desc:" Space* Name Space* ":" NoNewLineChar*</screen>
<screen>EventSpecification := "event:" Space* Name InheritedDef? LongNameDef?</screen>
<screen>InheritedDef := "=" InheritedExpr</screen>
<screen>InheritedExpr := Name
    | Number Space* ("*" Space*)? Name
    | InheritedExpr Space* "+" Space* InheritedExpr</screen>
<screen>LongNameDef := ":" NoNewLineChar*</screen>
<screen>CostLineDef := "events:" Space* Name (Space+ Name)*
    | "positions:" "instr"? (Space+ "line")?</screen>
<screen>BodyLine := (empty line)
    | ('#' NoNewLineChar*)
    | CostLine
    | PositionSpecification
    | AssociationSpecification</screen>
<screen>CostLine := SubPositionList Costs?</screen>
<screen>SubPositionList := (SubPosition+ Space+)+</screen>
<screen>SubPosition := Number | "+" Number | "-" Number | "*"</screen>
<screen>Costs := (Number Space+)+</screen>
<screen>PositionSpecification := Position "=" Space* PositionName</screen>
<screen>Position := CostPosition | CalledPosition</screen>
<screen>CostPosition := "ob" | "fl" | "fi" | "fe" | "fn"</screen>
<screen>CalledPosition := "cob" | "cfl" | "cfn"</screen>
<screen>PositionName := ( "(" Number ")" )? (Space* NoNewLineChar* )?</screen>
<screen>AssociationSpecification := CallSpecification
    | JumpSpecification</screen>
<screen>CallSpecification := CallLine "\n" CostLine</screen>
<screen>CallLine := "calls=" Space* Number Space+ SubPositionList</screen>
<screen>JumpSpecification := ...</screen>
<screen>Space := " " | "\t"</screen>
<screen>Number := HexNumber | (Digit)+</screen>
<screen>Digit := "0" | ... | "9"</screen>
<screen>HexNumber := "0x" (Digit | HexChar)+</screen>
<screen>HexChar := "a" | ... | "f" | "A" | ... | "F"</screen>
<screen>Name := Alpha (Digit | Alpha)*</screen>
<screen>Alpha := "a" | ... | "z" | "A" | ... | "Z"</screen>
<screen>NoNewLineChar := all characters without "\n"</screen>
</para>

</sect2>

<sect2 id="cl-format.reference.header" xreflabel="Description of Header Lines">
<title>Description of Header Lines</title>

<para>The header has an arbitrary number of lines of the format
"key: value". Possible <emphasis>key</emphasis> values for the header are:</para>

<itemizedlist>

  <listitem>
    <para><computeroutput>version: number</computeroutput> [Callgrind]</para>
    <para>This is used to distinguish future profile data formats. A
    major version of 0 or 1 is supposed to be upwards compatible with
    Cachegrind's format. It is optional; if it does not appear, version 1
    is assumed. Otherwise, this has to be the first header line.</para>
  </listitem>

  <listitem>
    <para><computeroutput>pid: process id</computeroutput> [Callgrind]</para>
    <para>This specifies the process ID of the supervised application
    for which this profile was generated.</para>
  </listitem>

  <listitem>
    <para><computeroutput>cmd: program name + args</computeroutput> [Cachegrind]</para>
    <para>This specifies the full command line of the supervised
    application for which this profile was generated.</para>
  </listitem>

  <listitem>
    <para><computeroutput>part: number</computeroutput> [Callgrind]</para>
    <para>This specifies a sequentially incremented number for each dump
    generated, starting at 1.</para>
  </listitem>

  <listitem>
    <para><computeroutput>desc: type: value</computeroutput> [Cachegrind]</para>
    <para>This specifies various information for this dump. For some
    types, the semantics are defined, but any description type is allowed.
    Unknown types should be ignored.</para>
    <para>There are the types "I1 cache", "D1 cache", "L2 cache", which
    specify parameters used for the cache simulator. These are the only
    types originally used by Cachegrind. Additionally, Callgrind uses
    the following types: "Timerange" gives a rough range of the basic
    block counter, for which the cost of this dump was collected.
    Type "Trigger" states the reason why this trace was generated,
    e.g. program termination or forced interactive dump.</para>
  </listitem>

  <listitem>
    <para><computeroutput>positions: [instr] [line]</computeroutput> [Callgrind]</para>
    <para>For cost lines, this defines the semantics of the first numbers.
    Any combination of "instr", "bb" and "line" is allowed, but has to be
    in this order which corresponds to position numbers at the start of
    the cost lines later in the file.</para>
    <para>If "instr" is specified, the position is the address of an
    instruction whose execution raised the events given later on the
    line. This address is relative to the offset of the binary/shared
    library file, so that relocation info does not have to be specified.
    For "line", the position is the line number of a source file, which is
    responsible for the events raised. Note that the mapping of "instr"
    and "line" positions is given by the debugging line information
    produced by the compiler.</para>
    <para>This field is optional. If not specified, only "line" is
    assumed.</para>
  </listitem>

  <listitem>
    <para><computeroutput>events: event type abbreviations</computeroutput> [Cachegrind]</para>
    <para>A list of short names of the event types logged in this file.
    The order is the same as in cost lines. The first event type is the
    second or third number in a cost line, depending on the value of
    "positions". Callgrind does not add additional cost types. Specify
    exactly once.</para>
    <para>Cost types from original Cachegrind are:
    <itemizedlist>
      <listitem>
        <para><command>Ir</command>: Instruction read access</para>
      </listitem>
      <listitem>
        <para><command>I1mr</command>: Instruction Level 1 read cache miss</para>
      </listitem>
      <listitem>
        <para><command>I2mr</command>: Instruction Level 2 read cache miss</para>
      </listitem>
      <listitem>
        <para>...</para>
      </listitem>
    </itemizedlist>
    </para>
  </listitem>

  <listitem>
    <para><computeroutput>summary: costs</computeroutput> [Callgrind]</para>
    <para><computeroutput>totals: costs</computeroutput> [Cachegrind]</para>
    <para>The value is the total number of events covered by this trace
    file. Both keys have the same meaning, but the "totals:" line
    happens to be at the end of the file, while "summary:" appears in
    the header. This was added to allow postprocessing tools to know
    the total cost in advance. The two lines always give the same cost
    counts.</para>
  </listitem>

</itemizedlist>

</sect2>

<sect2 id="cl-format.reference.body" xreflabel="Description of Body Lines">
<title>Description of Body Lines</title>

<para>Body lines can have the form
<computeroutput>spec=position</computeroutput>. The values for position
specifications are arbitrary strings. When starting with "(" and a
digit, it's a string in compressed format. Otherwise it's the real
position string. This allows for file and symbol names as position
strings, as these never start with "(" + <emphasis>digit</emphasis>.
The compressed format is either "(" <emphasis>number</emphasis> ")"
<emphasis>space</emphasis> <emphasis>position</emphasis> or only
"(" <emphasis>number</emphasis> ")". The first relates
<emphasis>position</emphasis> to <emphasis>number</emphasis> in the
context of the given format specification from this line to the end of
the file; it makes the (<emphasis>number</emphasis>) an alias for
<emphasis>position</emphasis>. Compressed format is always
optional.</para>

<para>Position specifications allowed:</para>
<itemizedlist>

  <listitem>
    <para><computeroutput>ob=</computeroutput> [Callgrind]</para>
    <para>The ELF object in which the cost of the following cost lines
    occurs.</para>
  </listitem>

  <listitem>
    <para><computeroutput>fl=</computeroutput> [Cachegrind]</para>
  </listitem>

  <listitem>
    <para><computeroutput>fi=</computeroutput> [Cachegrind]</para>
  </listitem>

  <listitem>
    <para><computeroutput>fe=</computeroutput> [Cachegrind]</para>
    <para>The source file including the code which is responsible for
    the cost of the following cost lines. "fi="/"fe=" is used when the source
    file changes inside of a function, i.e. for inlined code.</para>
  </listitem>

  <listitem>
    <para><computeroutput>fn=</computeroutput> [Cachegrind]</para>
    <para>The name of the function in which the cost of the following cost
    lines occurs.</para>
  </listitem>

  <listitem>
    <para><computeroutput>cob=</computeroutput> [Callgrind]</para>
    <para>The ELF object of the target of the next call cost lines.</para>
  </listitem>

  <listitem>
    <para><computeroutput>cfl=</computeroutput> [Callgrind]</para>
    <para>The source file including the code of the target of the
    next call cost lines.</para>
  </listitem>

  <listitem>
    <para><computeroutput>cfn=</computeroutput> [Callgrind]</para>
    <para>The name of the target function of the next call cost
    lines.</para>
  </listitem>

  <listitem>
    <para><computeroutput>calls=</computeroutput> [Callgrind]</para>
    <para>The number of nonrecursive calls which are responsible for the
    cost specified by the next call cost line. This is the cost spent
    inside of the called function.</para>
    <para>After "calls=" there MUST be a cost line. This is the cost
    spent in the called function. The first number is the source line
    from where the call happened.</para>
  </listitem>

  <listitem>
    <para><computeroutput>jump=count target position</computeroutput> [Callgrind]</para>
    <para>Unconditional jump, executed count times, to the given target
    position.</para>
  </listitem>

  <listitem>
    <para><computeroutput>jcnd=exe.count jumpcount target position</computeroutput> [Callgrind]</para>
    <para>Conditional jump, executed exe.count times with jumpcount
    jumps to the given target position.</para>
  </listitem>

</itemizedlist>

</sect2>

</sect1>

</chapter>