<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
                      "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
  <title>Source Level Debugging with LLVM</title>
  <link rel="stylesheet" href="llvm.css" type="text/css">
</head>
<body>

<div class="doc_title">Source Level Debugging with LLVM</div>

<table class="layout" style="width:100%">
  <tr class="layout">
    <td class="left">
<ul>
  <li><a href="#introduction">Introduction</a>
  <ol>
    <li><a href="#phil">Philosophy behind LLVM debugging information</a></li>
    <li><a href="#debugopt">Debugging optimized code</a></li>
    <li><a href="#future">Future work</a></li>
  </ol></li>
  <li><a href="#llvm-db">Using the <tt>llvm-db</tt> tool</a>
  <ol>
    <li><a href="#limitations">Limitations of <tt>llvm-db</tt></a></li>
    <li><a href="#sample">A sample <tt>llvm-db</tt> session</a></li>
    <li><a href="#startup">Starting the debugger</a></li>
    <li><a href="#commands">Commands recognized by the debugger</a></li>
  </ol></li>

  <li><a href="#architecture">Architecture of the LLVM debugger</a>
  <ol>
    <li><a href="#arch_debugger">The Debugger and InferiorProcess classes</a></li>
    <li><a href="#arch_info">The RuntimeInfo, ProgramInfo, and SourceLanguage classes</a></li>
    <li><a href="#arch_llvm-db">The <tt>llvm-db</tt> tool</a></li>
    <li><a href="#arch_todo">Short-term TODO list</a></li>
  </ol></li>

  <li><a href="#format">Debugging information format</a>
  <ol>
    <li><a href="#format_common_anchors">Anchors for global objects</a></li>
    <li><a href="#format_common_stoppoint">Representing stopping points in the source program</a></li>
    <li><a href="#format_common_lifetime">Object lifetimes and scoping</a></li>
    <li><a href="#format_common_descriptors">Object descriptor formats</a>
    <ul>
      <li><a href="#format_common_source_files">Representation of source files</a></li>
      <li><a href="#format_common_program_objects">Representation of program objects</a></li>
      <li><a href="#format_common_object_contexts">Program object contexts</a></li>
    </ul></li>
    <li><a href="#format_common_intrinsics">Debugger intrinsic functions</a></li>
    <li><a href="#format_common_tags">Values for debugger tags</a></li>
  </ol></li>
  <li><a href="#ccxx_frontend">C/C++ front-end specific debug information</a>
  <ol>
    <li><a href="#ccxx_pse">Program Scope Entries</a>
    <ul>
      <li><a href="#ccxx_compilation_units">Compilation unit entries</a></li>
      <li><a href="#ccxx_modules">Module, namespace, and importing entries</a></li>
    </ul></li>
    <li><a href="#ccxx_dataobjects">Data objects (program variables)</a></li>
  </ol></li>
</ul>
</td>
<td class="right">
<img src="img/venusflytrap.jpg" alt="A leafy and green bug eater" width="247"
height="369">
</td>
</tr></table>

<div class="doc_author">
  <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a></p>
</div>


<!-- *********************************************************************** -->
<div class="doc_section"><a name="introduction">Introduction</a></div>
<!-- *********************************************************************** -->

<div class="doc_text">

<p>This document is the central repository for all information pertaining to
debug information in LLVM. It describes the <a href="#llvm-db">user
interface</a> for the <tt>llvm-db</tt> tool, which provides a
powerful <a href="#llvm-db">source-level debugger</a>
to users of LLVM-based compilers. It then describes the <a
href="#architecture">various components</a> that make up the debugger and the
libraries which future clients may use. Finally, it describes the <a
href="#format">actual format that the LLVM debug information</a> takes,
which is useful for those interested in creating front-ends or dealing directly
with the information.</p>

91</div>
92
93<!-- ======================================================================= -->
94<div class="doc_subsection">
95 <a name="phil">Philosophy behind LLVM debugging information</a>
96</div>
97
98<div class="doc_text">
99
<p>The idea of the LLVM debugging information is to capture how the important
pieces of the source-language's Abstract Syntax Tree map onto LLVM code.
Several design aspects have shaped the solution that appears here. The
important ones are:</p>

<ul>
<li>Debugging information should have very little impact on the rest of the
compiler. No transformations, analyses, or code generators should need to be
modified because of debugging information.</li>

<li>LLVM optimizations should interact in <a href="#debugopt">well-defined and
easily described ways</a> with the debugging information.</li>

<li>Because LLVM is designed to support arbitrary programming languages,
LLVM-to-LLVM tools should not need to know anything about the semantics of the
source-level-language.</li>

<li>Source-level languages are often <b>widely</b> different from one another.
LLVM should not put any restrictions on the flavor of the source-language, and
the debugging information should work with any language.</li>

<li>With code generator support, it should be possible to use an LLVM compiler
to compile a program to native machine code and standard debugging formats.
This allows compatibility with traditional machine-code level debuggers, like
GDB or DBX.</li>

</ul>

<p>The approach used by the LLVM implementation is to use a small set of <a
href="#format_common_intrinsics">intrinsic functions</a> to define a mapping
between LLVM program objects and the source-level objects. The description of
the source-level program is maintained in LLVM global variables in an <a
href="#ccxx_frontend">implementation-defined format</a> (the C/C++ front-end
currently uses working draft 7 of the <a
href="http://www.eagercon.com/dwarf/dwarf3std.htm">Dwarf 3 standard</a>).</p>

<p>When a program is debugged, the debugger interacts with the user and turns
the stored debug information into source-language specific information. As
such, the debugger must be aware of the source-language, and is thus tied to a
specific language or family of languages. The <a href="#llvm-db">LLVM
debugger</a> is designed to be modular in its support for source-languages.</p>

142</div>
143
144
145<!-- ======================================================================= -->
146<div class="doc_subsection">
147 <a name="debugopt">Debugging optimized code</a>
148</div>
149
150<div class="doc_text">

<p>An extremely high priority of LLVM debugging information is to make it
interact well with optimizations and analysis. In particular, the LLVM debug
information provides the following guarantees:</p>

<ul>

<li>LLVM debug information <b>always provides information to accurately read the
source-level state of the program</b>, regardless of which LLVM optimizations
have been run, and without any modification to the optimizations themselves.
However, some optimizations may impact the ability to modify the current state
of the program with a debugger, such as setting program variables, or calling
functions that have been deleted.</li>

<li>LLVM optimizations gracefully interact with debugging information. If they
are not aware of debug information, they are automatically disabled as necessary
in the cases that would invalidate the debug info. This retains the LLVM
features making it easy to write new transformations.</li>

<li>As desired, LLVM optimizations can be upgraded to be aware of the LLVM
debugging information, allowing them to update the debugging information as they
perform aggressive optimizations. This means that, with effort, the LLVM
optimizers could optimize debug code just as well as non-debug code.</li>

<li>LLVM debug information does not prevent many important optimizations from
happening (for example inlining, basic block reordering/merging/cleanup, tail
duplication, etc.), further reducing the amount of the compiler that eventually
is "aware" of debugging information.</li>

<li>LLVM debug information is automatically optimized along with the rest of the
program, using existing facilities. For example, duplicate information is
automatically merged by the linker, and unused information is automatically
removed.</li>

</ul>

<p>Basically, the debug information allows you to compile a program with
"<tt>-O0 -g</tt>" and get full debug information, allowing you to arbitrarily
modify the program as it executes from the debugger. Compiling a program with
"<tt>-O3 -g</tt>" gives you full debug information that is always available and
accurate for reading (e.g., you get accurate stack traces despite tail call
elimination and inlining), but you might lose the ability to modify the program
and call functions which were optimized out of the program, or inlined away
completely.</p>

196</div>
197
<!-- ======================================================================= -->
<div class="doc_subsection">
  <a name="future">Future work</a>
</div>

<div class="doc_text">
<p>There are several important extensions that could eventually be added to the
LLVM debugger. The most important extension would be to upgrade the LLVM code
generators to support debugging information. This would also allow, for
example, the X86 code generator to emit native objects that contain debugging
information consumable by traditional source-level debuggers like GDB or
DBX.</p>

<p>Additionally, LLVM optimizations can be upgraded to incrementally update the
debugging information, <a href="#commands">new commands</a> can be added to the
debugger, and thread support could be added to the debugger.</p>

<p>The "SourceLanguage" modules provided by <tt>llvm-db</tt> could be
substantially improved to provide good support for C++ language features like
namespaces and scoping rules.</p>

<p>After working with the debugger for a while, perhaps the nicest improvement
would be to add some sort of line editor, such as GNU readline (but one that is
compatible with the LLVM license).</p>

<p>For someone so inclined, it should be straightforward to write different
front-ends for the LLVM debugger, as the LLVM debugging engine is cleanly
separated from the <tt>llvm-db</tt> front-end. A new LLVM GUI debugger or IDE
would be nice.</p>

228</div>
229
<!-- *********************************************************************** -->
<div class="doc_section">
  <a name="llvm-db">Using the <tt>llvm-db</tt> tool</a>
</div>
<!-- *********************************************************************** -->

<div class="doc_text">

<p>The <tt>llvm-db</tt> tool provides a GDB-like interface for source-level
debugging of programs. This tool provides many standard commands for inspecting
and modifying the program as it executes, loading new programs, single stepping,
placing breakpoints, etc. This section describes how to use the debugger.</p>

<p><tt>llvm-db</tt> has been designed to be as similar to GDB in its user
interface as possible. This should make it extremely easy to learn
<tt>llvm-db</tt> if you already know <tt>GDB</tt>. In general, <tt>llvm-db</tt>
provides the subset of GDB commands that are applicable to LLVM debugging users.
If there is a command missing that makes a reasonable amount of sense within the
<a href="#limitations">limitations of <tt>llvm-db</tt></a>, please report it as
a bug or, better yet, submit a patch to add it.</p>

251</div>
252
253<!-- ======================================================================= -->
254<div class="doc_subsection">
255 <a name="limitations">Limitations of <tt>llvm-db</tt></a>
256</div>
257
258<div class="doc_text">
259
<p><tt>llvm-db</tt> is designed to be modular and easy to extend. This
extensibility was key to getting the debugger up-and-running quickly, because we
can start with simple-but-unsophisticated implementations of various components.
Because of this, it is currently missing many features, though they should be
easy to add over time (patches welcomed!). The biggest inherent limitations of
<tt>llvm-db</tt> are currently due to the extremely simple <a
href="#arch_debugger">debugger backend</a> (implemented in
"lib/Debugger/UnixLocalInferiorProcess.cpp"), which is designed to work without
any cooperation from the code generators. Because it is so simple, it suffers
from the following inherent limitations:</p>

<ul>

<li>Running a program in <tt>llvm-db</tt> is a bit slower than running it with
<tt>lli</tt> (i.e., in the JIT).</li>

<li>Inspection of the target hardware is not supported. This means that you
cannot, for example, print the contents of X86 registers.</li>

<li>Inspection of LLVM code is not supported. This means that you cannot print
the contents of arbitrary LLVM values, or use commands such as <tt>stepi</tt>.
This also means that you cannot debug code without debug information.</li>

<li>Portions of the debugger run in the same address space as the program being
debugged. This means that memory corruption by the program could trample on
portions of the debugger.</li>

<li>Attaching to existing processes and core files is not currently
supported.</li>

</ul>

<p>That said, the debugger is still quite useful, and all of these limitations
can be eliminated by integrating support for the debugger into the code
generators, and writing a new <a href="#arch_debugger">InferiorProcess</a>
subclass to use it. See the <a href="#future">future work</a> section for ideas
of how to extend the LLVM debugger despite these limitations.</p>

298</div>
299
300
301<!-- ======================================================================= -->
302<div class="doc_subsection">
303 <a name="sample">A sample <tt>llvm-db</tt> session</a>
304</div>
305
306<div class="doc_text">
307
<p>TODO: this is obviously lame, when more is implemented, this can be much
better.</p>

<pre>
$ <b>llvm-db funccall</b>
llvm-db: The LLVM source-level debugger
Loading program... successfully loaded 'funccall.bc'!
(llvm-db) <b>create</b>
Starting program: funccall.bc
main at funccall.c:9:2
9 ->            q = 0;
(llvm-db) <b>list main</b>
4       void foo() {
5               int t = q;
6               q = t + 1;
7       }
8       int main() {
9 ->            q = 0;
10              foo();
11              q = q - 1;
12
13              return q;
(llvm-db) <b>list</b>
14      }
(llvm-db) <b>step</b>
10 ->           foo();
(llvm-db) <b>s</b>
foo at funccall.c:5:2
5 ->            int t = q;
(llvm-db) <b>bt</b>
#0 ->   0x85ffba0 in foo at funccall.c:5:2
#1      0x85ffd98 in main at funccall.c:10:2
(llvm-db) <b>finish</b>
main at funccall.c:11:2
11 ->           q = q - 1;
(llvm-db) <b>s</b>
13 ->           return q;
(llvm-db) <b>s</b>
The program stopped with exit code 0
(llvm-db) <b>quit</b>
$
</pre>

351</div>
352
353
354
355<!-- ======================================================================= -->
356<div class="doc_subsection">
357 <a name="startup">Starting the debugger</a>
358</div>
359
360<div class="doc_text">
361
362<p>There are three ways to start up the <tt>llvm-db</tt> debugger:</p>
363
<p>When run with no options, just <tt>llvm-db</tt>, the debugger starts up
without a program loaded at all. You must use the <a
href="#c_file"><tt>file</tt> command</a> to load a program, and the <a
href="#c_set_args"><tt>set args</tt></a> or <a href="#c_run"><tt>run</tt></a>
commands to specify the arguments for the program.</p>

<p>If you start the debugger with one argument, as <tt>llvm-db
&lt;program&gt;</tt>, the debugger will start up and load in the specified
program. You can then optionally specify arguments to the program with the <a
href="#c_set_args"><tt>set args</tt></a> or <a href="#c_run"><tt>run</tt></a>
commands.</p>

<p>The third way to start the debugger is with the <tt>--args</tt> option. This
option allows you to specify the program to load and the arguments to start out
with. <!-- No options to <tt>llvm-db</tt> may be specified after the
<tt>-args</tt> option. --> Example use: <tt>llvm-db --args ls /home</tt></p>
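
<p>The three startup modes can be summarized with a small sketch. This is a
hypothetical illustration, not the option handling that <tt>llvm-db</tt>
actually performs: the <tt>StartupConfig</tt> type and the
<tt>parseStartup</tt> function are invented names, and the argument list is
assumed to exclude the debugger's own name.</p>

```cpp
#include <string>
#include <vector>

// Hypothetical result of interpreting the debugger's command line.
struct StartupConfig {
  std::string Program;           // empty: no program loaded yet
  std::vector<std::string> Args; // initial arguments for the program
};

// Interprets the arguments that follow "llvm-db" on the command line.
// Assumption: only the three forms described in the text are handled.
StartupConfig parseStartup(const std::vector<std::string> &Argv) {
  StartupConfig Config;

  // Form 1: no arguments, so nothing is loaded until the "file" command.
  if (Argv.empty())
    return Config;

  // Form 3: "--args <program> [arguments...]".
  if (Argv[0] == "--args") {
    if (Argv.size() > 1) {
      Config.Program = Argv[1];
      Config.Args.assign(Argv.begin() + 2, Argv.end());
    }
    return Config;
  }

  // Form 2: a single program name; arguments come later via "set args"
  // or "run".
  Config.Program = Argv[0];
  return Config;
}
```

<p>Under these assumptions, the arguments <tt>--args ls /home</tt> yield the
program <tt>ls</tt> with the single initial argument <tt>/home</tt>.</p>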
380
381</div>
382
383<!-- ======================================================================= -->
384<div class="doc_subsection">
385 <a name="commands">Commands recognized by the debugger</a>
386</div>
387
388<div class="doc_text">
389
390<p>FIXME: this needs work obviously. See the <a
391href="http://sources.redhat.com/gdb/documentation/">GDB documentation</a> for
392information about what these do, or try '<tt>help [command]</tt>' within
393<tt>llvm-db</tt> to get information.</p>
394
395<p>
396<h2>General usage:</h2>
397<ul>
398<li>help [command]</li>
399<li>quit</li>
400<li><a name="c_file">file</a> [program]</li>
401</ul>
402
403<h2>Program inspection and interaction:</h2>
404<ul>
405<li>create (start the program, stopping it ASAP in <tt>main</tt>)</li>
406<li>kill</li>
<li><a name="c_run">run</a> [args]</li>
408<li>step [num]</li>
409<li>next [num]</li>
410<li>cont</li>
411<li>finish</li>
412
413<li>list [start[, end]]</li>
414<li>info source</li>
415<li>info sources</li>
416<li>info functions</li>
417</ul>
418
419<h2>Call stack inspection:</h2>
420<ul>
421<li>backtrace</li>
422<li>up [n]</li>
423<li>down [n]</li>
424<li>frame [n]</li>
425</ul>
426
427
428<h2>Debugger inspection and interaction:</h2>
429<ul>
430<li>info target</li>
431<li>show prompt</li>
432<li>set prompt</li>
433<li>show listsize</li>
434<li>set listsize</li>
435<li>show language</li>
436<li>set language</li>
<li>show args</li>
<li><a name="c_set_args">set args</a> [args]</li>
</ul>
440
441<h2>TODO:</h2>
442<ul>
443<li>info frame</li>
444<li>break</li>
445<li>print</li>
446<li>ptype</li>
447
448<li>info types</li>
449<li>info variables</li>
450<li>info program</li>
451
452<li>info args</li>
453<li>info locals</li>
454<li>info catch</li>
455<li>... many others</li>
456</ul>

</div>
459
460<!-- *********************************************************************** -->
461<div class="doc_section">
462 <a name="architecture">Architecture of the LLVM debugger</a>
463</div>
464<!-- *********************************************************************** -->
465
466<div class="doc_text">
<p>The LLVM debugger is built out of three distinct layers of software. These
layers provide clients with different interface options depending on which pieces
they want to implement themselves, and the layering also promotes code modularity
and good design. The three layers are the <a href="#arch_debugger">Debugger
interface</a>, the <a href="#arch_info">"info" interfaces</a>, and the <a
href="#arch_llvm-db"><tt>llvm-db</tt> tool</a> itself.</p>
</div>

<!-- ======================================================================= -->
476<div class="doc_subsection">
477 <a name="arch_debugger">The Debugger and InferiorProcess classes</a>
478</div>

<div class="doc_text">
<p>The Debugger class (defined in the <tt>include/llvm/Debugger/</tt> directory)
is a low-level class which is used to maintain information about the loaded
program, as well as to start and stop the running program as necessary. This
class does not provide any high-level analysis or control over the program, only
exposing simple interfaces like <tt>load/unloadProgram</tt>,
<tt>create/killProgram</tt>, <tt>step/next/finish/contProgram</tt>, and
low-level methods for installing breakpoints.</p>
Chris Lattner8ff75902004-01-06 05:31:32 +0000488
489<p>
490The Debugger class is itself a wrapper around the lowest-level InferiorProcess
491class. This class is used to represent an instance of the program running under
492debugger control. The InferiorProcess class can be implemented in different
493ways for different targets and execution scenarios (e.g., remote debugging).
494The InferiorProcess class exposes a small and simple collection of interfaces
495which are useful for inspecting the current state of the program (such as
496collecting stack trace information, reading the memory image of the process,
497etc). The interfaces in this class are designed to be as low-level and simple
498as possible, to make it easy to create new instances of the class.
499</p>
500
<p>
The Debugger class exposes the currently active instance of InferiorProcess
through the <tt>Debugger::getRunningProcess</tt> method, which returns a
<tt>const</tt> reference to the class. This means that clients of the Debugger
class can only <b>inspect</b> the running instance of the program directly. To
change the executing process in some way, they must use the interfaces exposed by
the Debugger class.
</p>
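
<p>
The effect of handing out a <tt>const</tt> reference can be sketched as
follows. The two classes below are simplified, hypothetical stand-ins written
for this illustration; apart from the names <tt>getRunningProcess</tt>,
<tt>createProgram</tt>, <tt>killProgram</tt>, and <tt>finishProgram</tt>, which
are taken from the text above, every member and signature is an assumption and
does not reflect the real headers.
</p>

```cpp
#include <memory>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for the two classes described above.
// They are not the declarations found in include/llvm/Debugger/.
class InferiorProcess {
  std::vector<std::string> Frames; // call stack, innermost frame last

public:
  // Inspection interface: usable through a const reference.
  const std::vector<std::string> &getStackTrace() const { return Frames; }

  // Mutating interface: not callable through a const reference.
  void pushFrame(const std::string &Function) { Frames.push_back(Function); }
  void popFrame() {
    if (!Frames.empty())
      Frames.pop_back();
  }
};

class Debugger {
  std::unique_ptr<InferiorProcess> Process; // null when nothing is running

  // Private mutable access, used only by the Debugger's own operations.
  InferiorProcess &process() {
    if (!Process)
      throw std::logic_error("no program is running");
    return *Process;
  }

public:
  bool isProgramRunning() const { return Process != nullptr; }

  void createProgram() {
    Process.reset(new InferiorProcess());
    Process->pushFrame("main");
  }
  void killProgram() { Process.reset(); }

  // Stand-ins for operations that change the state of the process.
  void stepIntoCall(const std::string &Callee) { process().pushFrame(Callee); }
  void finishProgram() { process().popFrame(); }

  // Clients only ever receive read-only access to the running process.
  const InferiorProcess &getRunningProcess() const {
    if (!Process)
      throw std::logic_error("no program is running");
    return *Process;
  }
};
```

<p>
With these definitions, <tt>D.getRunningProcess().getStackTrace()</tt>
compiles, while <tt>D.getRunningProcess().popFrame()</tt> is rejected by the
compiler because <tt>popFrame</tt> is not a <tt>const</tt> member function.
</p>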
509</div>
510
511<!-- ======================================================================= -->
512<div class="doc_subsection">
513 <a name="arch_info">The RuntimeInfo, ProgramInfo, and SourceLanguage classes</a>
514</div>
515
516<div class="doc_text">
517<p>
518The next-highest level of debugger abstraction is provided through the
519ProgramInfo, RuntimeInfo, SourceLanguage and related classes (also defined in
520the <tt>include/llvm/Debugger/</tt> directory). These classes efficiently
521decode the debugging information and low-level interfaces exposed by
522InferiorProcess into a higher-level representation, suitable for analysis by the
523debugger.
524</p>
525
526<p>
527The ProgramInfo class exposes a variety of different kinds of information about
528the program objects in the source-level-language. The SourceFileInfo class
529represents a source-file in the program (e.g. a .cpp or .h file). The
530SourceFileInfo class captures information such as which SourceLanguage was used
531to compile the file, where the debugger can get access to the actual file text
532(which is lazily loaded on demand), etc. The SourceFunctionInfo class
533represents a... <b>FIXME: finish</b>. The ProgramInfo class provides interfaces
534to lazily find and decode the information needed to create the Source*Info
535classes requested by the debugger.
536</p>
537
538<p>
539The RuntimeInfo class exposes information about the currently executed program,
540by decoding information from the InferiorProcess and ProgramInfo classes. It
541provides a StackFrame class which provides an easy-to-use interface for
542inspecting the current and suspended stack frames in the program.
543</p>
544
<p>
The SourceLanguage class is an abstract interface used by the debugger to
perform all source-language-specific tasks. For example, this interface is used
by the ProgramInfo class to decode language-specific types and functions and by
the debugger front-end (such as <a href="#arch_llvm-db"><tt>llvm-db</tt></a>) to
evaluate source-language expressions typed into the debugger. This class uses
the RuntimeInfo &amp; ProgramInfo classes to get information about the current
execution context and the loaded program, respectively.
</p>

555</div>
556
557<!-- ======================================================================= -->
558<div class="doc_subsection">
  <a name="arch_llvm-db">The <tt>llvm-db</tt> tool</a>
560</div>
561
562<div class="doc_text">
<p>
The <tt>llvm-db</tt> tool is designed to be a debugger providing an interface as <a
href="#llvm-db">similar to GDB</a> as reasonable, but no more so than that.
Because the <a href="#arch_debugger">Debugger</a> and <a
href="#arch_info">info</a> classes implement all of the heavy lifting and
analysis, <tt>llvm-db</tt> (which lives in <tt>llvm/tools/llvm-db</tt>) consists
mainly of code to interact with the user and parse commands. The CLIDebugger
constructor registers all of the builtin commands for the debugger, and each
command is implemented as a CLIDebugger::[name]Command method.
</p>
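
<p>
The registration pattern described above can be sketched with a table of
member-function pointers. This is a minimal illustration, not the actual
CLIDebugger source: the handler signature, the <tt>execute</tt> method, and
the strings the handlers return are all assumptions made for the example.
</p>

```cpp
#include <map>
#include <sstream>
#include <string>

// Hypothetical sketch of the registration-and-dispatch pattern. The handler
// signature and the returned strings are invented for illustration.
class CLIDebugger {
  typedef std::string (CLIDebugger::*CommandHandler)(const std::string &Args);
  std::map<std::string, CommandHandler> Commands;

  std::string quitCommand(const std::string &) { return "quitting"; }

  std::string stepCommand(const std::string &Args) {
    unsigned Count = 1; // "step" with no argument steps once
    if (!Args.empty()) {
      std::istringstream In(Args);
      if (!(In >> Count))
        return "step: expected a number";
    }
    std::ostringstream Out;
    Out << "stepping " << Count << " line(s)";
    return Out.str();
  }

public:
  // The constructor registers every builtin command under its name.
  CLIDebugger() {
    Commands["quit"] = &CLIDebugger::quitCommand;
    Commands["step"] = &CLIDebugger::stepCommand;
    Commands["s"] = &CLIDebugger::stepCommand; // abbreviation of "step"
  }

  // Splits a line into a command name and its arguments, then dispatches.
  std::string execute(const std::string &Line) {
    std::istringstream In(Line);
    std::string Name, Args;
    In >> Name;
    std::getline(In >> std::ws, Args);

    std::map<std::string, CommandHandler>::const_iterator It =
        Commands.find(Name);
    if (It == Commands.end())
      return "unknown command '" + Name + "'";
    return (this->*(It->second))(Args);
  }
};
```

<p>
Adding a command in this scheme means writing one more <tt>[name]Command</tt>
method and registering it in the constructor; the dispatch code stays
unchanged.
</p>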
573</div>
574
575
576<!-- ======================================================================= -->
577<div class="doc_subsection">
  <a name="arch_todo">Short-term TODO list</a>
579</div>
580
581<div class="doc_text">
582
583<p>
584FIXME: this section will eventually go away. These are notes to myself of
585things that should be implemented, but haven't yet.
586</p>
587
588<p>
589<b>Breakpoints:</b> Support is already implemented in the 'InferiorProcess'
590class, though it hasn't been tested yet. To finish breakpoint support, we need
591to implement breakCommand (which should reuse the linespec parser from the list
592command), and handle the fact that 'break foo' or 'break file.c:53' may insert
593multiple breakpoints. Also, if you say 'break file.c:53' and there is no
594stoppoint on line 53, the breakpoint should go on the next available line. My
595idea was to have the Debugger class provide a "Breakpoint" class which
596encapsulated this messiness, giving the debugger front-end a simple interface.
597The debugger front-end would have to map the really complex semantics of
598temporary breakpoints and 'conditional' breakpoints onto this intermediate
599level. Also, breakpoints should survive as much as possible across program
600reloads.
601</p>
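
<p>
The rule that a breakpoint on a line without a stoppoint should move to the
next available line can be sketched as a lookup in a sorted set. The function
name <tt>resolveBreakpointLine</tt> and the convention of returning 0 when no
later stoppoint exists are assumptions made for this illustration.
</p>

```cpp
#include <set>

// Returns the line on which a breakpoint requested at line Requested should be
// placed: the first line at or after Requested that has a stoppoint.
// Returns 0 (assumed here to mean "no such line") if there is none.
unsigned resolveBreakpointLine(const std::set<unsigned> &StopPointLines,
                               unsigned Requested) {
  // lower_bound finds the first element that is not less than Requested.
  std::set<unsigned>::const_iterator It =
      StopPointLines.lower_bound(Requested);
  return It == StopPointLines.end() ? 0 : *It;
}
```

<p>
Taking the lines on which the sample session stopped in <tt>funccall.c</tt>
(5, 9, 10, 11, and 13) as the stoppoints, a request for line 12, which is
blank, resolves to line 13.
</p>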
602
<p>
<b>UnixLocalInferiorProcess.cpp speedup</b>: There is no reason for the debugged
process to code gen the globals corresponding to debug information. The
IntrinsicLowering object could instead change descriptors into constant expr
casts of the constant address of the LLVM objects for the descriptors. This
would also allow us to eliminate the mapping back and forth between physical
addresses that must be done.</p>
610
<p>
<b>Process deaths</b>: The InferiorProcessDead exception should be extended to
know "how" a process died, e.g., whether it was killed by a signal. This is easy
to collect in the UnixLocalInferiorProcess; we just need to represent it.</p>
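
<p>
On a POSIX system, the cause of death is available from the status word that
<tt>waitpid</tt> reports. The sketch below decodes it with the standard
<tt>WIFEXITED</tt> and <tt>WIFSIGNALED</tt> macros. The <tt>ProcessDeath</tt>
record and the <tt>runInChild</tt> helper are hypothetical names for this
example, and nothing here is taken from UnixLocalInferiorProcess.cpp.
</p>

```cpp
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical record of how a debugged process ended.
struct ProcessDeath {
  enum Kind { Exited, Signaled, Unknown };
  Kind How;
  int Code; // exit code if Exited, signal number if Signaled, otherwise 0
};

// Decodes a status word as filled in by waitpid().
ProcessDeath decodeWaitStatus(int Status) {
  ProcessDeath Death = {ProcessDeath::Unknown, 0};
  if (WIFEXITED(Status)) {
    Death.How = ProcessDeath::Exited;
    Death.Code = WEXITSTATUS(Status);
  } else if (WIFSIGNALED(Status)) {
    Death.How = ProcessDeath::Signaled;
    Death.Code = WTERMSIG(Status);
  }
  return Death;
}

// Runs Body in a forked child process and reports how the child ended.
ProcessDeath runInChild(void (*Body)()) {
  ProcessDeath Failed = {ProcessDeath::Unknown, 0};
  pid_t Child = fork();
  if (Child < 0)
    return Failed; // fork itself failed
  if (Child == 0) {
    Body();
    _exit(0); // Body returned normally
  }
  int Status = 0;
  if (waitpid(Child, &Status, 0) != Child)
    return Failed;
  return decodeWaitStatus(Status);
}
```

<p>
A child that calls <tt>_exit(3)</tt> is reported as <tt>Exited</tt> with code
3, while a child that raises <tt>SIGKILL</tt> on itself is reported as
<tt>Signaled</tt> with that signal number.
</p>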
615
</div>
617
<!-- *********************************************************************** -->
<div class="doc_section">
  <a name="format">Debugging information format</a>
</div>
<!-- *********************************************************************** -->

<div class="doc_text">

<p>LLVM debugging information has been carefully designed to make it possible
for the optimizer to optimize the program and debugging information without
necessarily having to know anything about debugging information. In particular,
the global constant merging pass automatically eliminates duplicated debugging
information (often caused by header files), the global dead code elimination
pass automatically deletes debugging information for a function if it decides to
delete the function, and the linker eliminates debug information when it merges
<tt>linkonce</tt> functions.</p>

<p>To do this, most of the debugging information (descriptors for types,
variables, functions, source files, etc) is inserted by the language front-end
in the form of LLVM global variables. These LLVM global variables are no
different from any other global variables, except that they have a web of LLVM
intrinsic functions that point to them. If the last references to a particular
piece of debugging information are deleted (for example, by the
<tt>-globaldce</tt> pass), the extraneous debug information will automatically
become dead and be removed by the optimizer.</p>

<p>The debugger is designed to be agnostic about the contents of most of the
debugging information. It uses a <a href="#arch_info">source-language-specific
module</a> to decode the information that represents variables, types,
functions, namespaces, etc: this allows for arbitrary source-language semantics
and type-systems to be used, as long as there is a module written for the
debugger to interpret the information.</p>

<p>To provide basic functionality, the LLVM debugger does have to make some
assumptions about the source-level language being debugged, though it keeps
these to a minimum. The only common features that the LLVM debugger assumes
exist are <a href="#format_common_source_files">source files</a>, and <a
href="#format_program_objects">program objects</a>. These abstract objects are
used by the debugger to form stack traces, show information about local
variables, etc.</p>

<p>This section of the documentation first describes the representation aspects
common to any source-language. The <a href="#ccxx_frontend">next section</a>
describes the data layout conventions used by the C and C++ front-ends.</p>

</div>

<!-- ======================================================================= -->
<div class="doc_subsection">
  <a name="format_common_anchors">Anchors for global objects</a>
</div>

<div class="doc_text">
<p>One important aspect of the LLVM debug representation is that it allows the
LLVM debugger to efficiently index all of the global objects without having to
scan the program. To do this, all of the global objects use "anchor" globals of
type "<tt>{}</tt>", with designated names. These anchor objects obviously do
not contain any content or meaning by themselves, but all of the global objects
of a particular type (e.g., source file descriptors) contain a pointer to the
anchor. This pointer allows the debugger to use def-use chains to find all
global objects of that type.</p>

<p>So far, the following names are recognized as anchors by the LLVM
debugger:</p>

<pre>
  %<a href="#format_common_source_files">llvm.dbg.translation_units</a> = linkonce global {} {}
  %<a href="#format_program_objects">llvm.dbg.globals</a> = linkonce global {} {}
</pre>

<p>Using anchors in this way (where the source file descriptor points to the
anchors, as opposed to having a list of source file descriptors) allows for the
standard dead global elimination and merging passes to automatically remove
unused debugging information. If the globals were kept track of through lists,
there would always be an object pointing to the descriptors, so they would
never be deleted.</p>
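<p>The difference between the two designs can be sketched with a toy model. The
Python below is only an illustration under stated assumptions: the
<tt>Global</tt> class, <tt>users_of</tt> and <tt>eliminate_dead_globals</tt> are
hypothetical stand-ins for LLVM globals, def-use chains and a dead global
elimination pass, and they find users by scanning a list where LLVM would
consult the def-use chain directly.</p>

```python
class Global:
    """Stand-in for an LLVM global; `operands` are the globals it points to."""

    def __init__(self, name, operands=()):
        self.name = name
        self.operands = list(operands)


def users_of(value, module):
    """Names of all globals in `module` that point at `value`."""
    return [g.name for g in module if value in g.operands]


def eliminate_dead_globals(module, roots):
    """Names of the globals that survive: those reachable from `roots`."""
    live = []
    worklist = list(roots)
    while worklist:
        g = worklist.pop()
        if g not in live:
            live.append(g)
            worklist.extend(g.operands)
    return [g.name for g in module if g in live]


ANCHOR = Global("llvm.dbg.translation_units")
UNIT_A = Global("a.c descriptor", [ANCHOR])
UNIT_B = Global("b.c descriptor", [ANCHOR])
FOO = Global("foo", [UNIT_A])  # the only remaining code refers to a.c
MODULE = [ANCHOR, UNIT_A, UNIT_B, FOO]

# The alternative design: a list object that points at every descriptor.
DESCRIPTOR_LIST = Global("descriptor list", [UNIT_A, UNIT_B])
```

<p>In this model, <tt>users_of(ANCHOR, MODULE)</tt> yields both descriptors, and
eliminating dead globals with <tt>FOO</tt> as the only root drops the unused
<tt>b.c</tt> descriptor. Once <tt>DESCRIPTOR_LIST</tt> is added as a root, the
<tt>b.c</tt> descriptor can no longer be removed.</p>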

</div>

<!-- ======================================================================= -->
<div class="doc_subsection">
  <a name="format_common_stoppoint">
     Representing stopping points in the source program
  </a>
</div>

<div class="doc_text">

<p>LLVM debugger "stop points" are a key part of the debugging representation
that allows LLVM to maintain simple semantics for <a
href="#debugopt">debugging optimized code</a>. The basic idea is that the
front-end inserts calls to the <tt>%llvm.dbg.stoppoint</tt> intrinsic function
at every point in the program where the debugger should be able to inspect the
program (these correspond to places the debugger stops when you "<tt>step</tt>"
through it). The front-end can choose to place these at as fine a granularity
as it would like (for example, before every subexpression evaluated), but it is
recommended to only put them after every source statement that includes
executable code.</p>

<p>Using calls to this intrinsic function to demarcate legal points for the
debugger to inspect the program automatically disables any optimizations that
could potentially confuse debugging information. To non-debug-information-aware
transformations, these calls simply look like calls to an external function,
which they must assume can do anything (including reading or writing to any part
of reachable memory). On the other hand, it does not impact many optimizations,
such as code motion of non-trapping instructions, nor does it impact
optimization of subexpressions, code duplication transformations, or basic-block
reordering transformations.</p>

<p>An important aspect of the calls to the <tt>%llvm.dbg.stoppoint</tt>
intrinsic is that the function-local debugging information is woven together
with use-def chains. This makes it easy for the debugger to, for example,
locate the 'next' stop point. For a concrete example of stop points, see the
example in <a href="#format_common_lifetime">the next section</a>.</p>
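<p>As a rough sketch of how such a chain can be followed, the Python below
models each debug intrinsic call as an object that records the call it consumes
(use-def) and the calls that consume it (def-use). The <tt>DebugCall</tt> class,
its <tt>kind</tt> strings and the <tt>next_stoppoint</tt> helper are assumptions
made for this illustration and do not correspond to the debugger's real
classes.</p>

```python
class DebugCall:
    """Toy model of one debug intrinsic call in a function."""

    def __init__(self, kind, prev=None, line=None):
        self.kind = kind      # e.g. "func.start", "stoppoint", "declare"
        self.prev = prev      # use-def edge: the call whose result we consume
        self.line = line      # source line, only meaningful for stop points
        self.users = []       # def-use edges: calls that consume our result
        if prev is not None:
            prev.users.append(self)


def next_stoppoint(call):
    """Follow def-use edges forward from `call` to the nearest stop point."""
    pending = list(call.users)
    while pending:
        user = pending.pop(0)
        if user.kind == "stoppoint":
            return user
        pending.extend(user.users)
    return None


D1 = DebugCall("func.start")
D2 = DebugCall("stoppoint", D1, line=2)
D3 = DebugCall("declare", D2)
D4 = DebugCall("stoppoint", D3, line=3)
```

<p>Starting from <tt>D2</tt>, the walk skips the intervening declaration and
reaches the stop point for line 3; starting from <tt>D4</tt>, there is no later
stop point and the helper returns <tt>None</tt>.</p>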

</div>


<!-- ======================================================================= -->
<div class="doc_subsection">
  <a name="format_common_lifetime">Object lifetimes and scoping</a>
</div>

<div class="doc_text">
<p>In many languages, the local variables in functions can have their lifetime
or scope limited to a subset of a function. In the C family of languages, for
example, variables are only live (readable and writable) within the source block
that they are defined in. In functional languages, values are only readable
after they have been defined. Though this is a very obvious concept, it is also
non-trivial to model in LLVM, because it has no notion of scoping in this sense,
and does not want to be tied to a language's scoping rules.</p>

<p>In order to handle this, the LLVM debug format uses the notion of "regions"
of a function, delineated by calls to intrinsic functions. These intrinsic
functions define new regions of the program and indicate when the region
lifetime expires. Consider the following C fragment, for example:</p>

<pre>
1.  void foo() {
2.    int X = ...;
3.    int Y = ...;
4.    {
5.      int Z = ...;
6.      ...
7.    }
8.    ...
9.  }
</pre>

<p>Compiled to LLVM, this function would be represented like this (FIXME: CHECK
AND UPDATE THIS):</p>

<pre>
void %foo() {
    %X = alloca int
    %Y = alloca int
    %Z = alloca int
    <a name="icl_ex_D1">%D1</a> = call {}* %llvm.dbg.func.start(<a href="#format_program_objects">%lldb.global</a>* %d.foo)
    %D2 = call {}* <a href="#format_common_stoppoint">%llvm.dbg.stoppoint</a>({}* %D1, uint 2, uint 2, <a href="#format_common_source_files">%lldb.compile_unit</a>* %file)

    %D3 = call {}* %llvm.dbg.DEFINEVARIABLE({}* %D2, ...)
    <i>;; Evaluate expression on line 2, assigning to X.</i>
    %D4 = call {}* <a href="#format_common_stoppoint">%llvm.dbg.stoppoint</a>({}* %D3, uint 3, uint 2, <a href="#format_common_source_files">%lldb.compile_unit</a>* %file)

    %D5 = call {}* %llvm.dbg.DEFINEVARIABLE({}* %D4, ...)
    <i>;; Evaluate expression on line 3, assigning to Y.</i>
    %D6 = call {}* <a href="#format_common_stoppoint">%llvm.dbg.stoppoint</a>({}* %D5, uint 5, uint 4, <a href="#format_common_source_files">%lldb.compile_unit</a>* %file)

    <a name="icl_ex_D7">%D7</a> = call {}* %llvm.dbg.region.start({}* %D6)
    %D8 = call {}* %llvm.dbg.DEFINEVARIABLE({}* %D7, ...)
    <i>;; Evaluate expression on line 5, assigning to Z.</i>
    %D9 = call {}* <a href="#format_common_stoppoint">%llvm.dbg.stoppoint</a>({}* %D8, uint 6, uint 4, <a href="#format_common_source_files">%lldb.compile_unit</a>* %file)

    <i>;; Code for line 6.</i>
    %D10 = call {}* %llvm.dbg.region.end({}* %D9)
    %D11 = call {}* <a href="#format_common_stoppoint">%llvm.dbg.stoppoint</a>({}* %D10, uint 8, uint 2, <a href="#format_common_source_files">%lldb.compile_unit</a>* %file)

    <i>;; Code for line 8.</i>
    <a name="icl_ex_D12">%D12</a> = call {}* %llvm.dbg.region.end({}* %D11)
    ret void
}
</pre>

<p>This example illustrates a few important details about the LLVM debugging
information. In particular, it shows how the various intrinsics used are woven
together with def-use and use-def chains, similar to how <a
href="#format_common_anchors">anchors</a> are used with globals. This allows
the debugger to analyze the relationship between statements, variable
definitions, and the code used to implement the function.</p>

<p>In this example, two explicit regions are defined, one with the <a
href="#icl_ex_D1">definition of the <tt>%D1</tt> variable</a> and one with the
<a href="#icl_ex_D7">definition of <tt>%D7</tt></a>. In the case of
<tt>%D1</tt>, the debug information indicates the start of the function whose <a
href="#format_program_objects">descriptor</a> is specified as an argument to the
intrinsic. This defines a new stack frame whose lifetime ends when the region
is ended by <a href="#icl_ex_D12">the <tt>%D12</tt> call</a>.</p>

<p>Using regions to represent the boundaries of source-level functions allows
LLVM interprocedural optimizations to arbitrarily modify LLVM functions without
having to worry about breaking mapping information between the LLVM code and the
source-level program. In particular, the inliner requires no modification
to support inlining with debugging information: there is no explicit correlation
drawn between LLVM functions and their source-level counterparts (note, however,
that if the inliner inlines all instances of a non-strong-linkage function into
its callers, it will not be possible for the user to manually invoke the
inlined function from the debugger).</p>

<p>Once the function has been defined, the <a
href="#format_common_stoppoint">stopping point</a> corresponding to line #2 of
the function is encountered. At this point in the function, <b>no</b> local
variables are live. As lines 2 and 3 of the example are executed, their
variable definitions are automatically introduced into the program, without the
need to specify a new region. These variables do not require new regions to be
introduced because they go out of scope at the same point in the program: line
9.</p>

<p>In contrast, the <tt>Z</tt> variable goes out of scope at a different time,
on line 7. For this reason, it is defined within <a href="#icl_ex_D7">the
<tt>%D7</tt> region</a>, which kills the availability of <tt>Z</tt> before the
code for line 8 is executed. In this way, regions can support arbitrary
source-language scoping rules, as long as they can only be nested (i.e., one
scope cannot partially overlap with a part of another scope).</p>
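<p>The scoping behaviour described for <tt>foo</tt> can be checked with a small
model. The Python below is a sketch only: it replays the order of the debug
intrinsic calls from the example as a list of events and keeps a stack of open
regions. The event names and the <tt>live_variables_at_stoppoints</tt> helper
are assumptions made for this illustration.</p>

```python
def live_variables_at_stoppoints(events):
    """Map each stop point line to the variables in scope when it is reached.

    `events` is a list of (kind, payload) pairs in program order. A region
    (or the function itself) opens a new scope; ending it discards every
    variable declared inside.
    """
    scopes = []
    result = {}
    for kind, payload in events:
        if kind in ("func.start", "region.start"):
            scopes.append([])
        elif kind == "region.end":
            scopes.pop()
        elif kind == "declare":
            scopes[-1].append(payload)
        elif kind == "stoppoint":
            result[payload] = [name for scope in scopes for name in scope]
    return result


# The calls %D1 through %D12 from the example, in order.
FOO_EVENTS = [
    ("func.start", "foo"),
    ("stoppoint", 2),
    ("declare", "X"),
    ("stoppoint", 3),
    ("declare", "Y"),
    ("stoppoint", 5),
    ("region.start", None),
    ("declare", "Z"),
    ("stoppoint", 6),
    ("region.end", None),
    ("stoppoint", 8),
    ("region.end", None),
]
```

<p>Replaying <tt>FOO_EVENTS</tt> shows no live variables at line 2, all of
<tt>X</tt>, <tt>Y</tt> and <tt>Z</tt> at line 6, and only <tt>X</tt> and
<tt>Y</tt> at line 8, after the inner region has ended.</p>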

<p>It is worth noting that this scoping mechanism is used to control scoping of
all declarations, not just variable declarations. For example, the scope of a
C++ using declaration is controlled with this, and the <tt>llvm-db</tt> C++
support routines could use this to change how name lookup is performed (though
this is not implemented yet).</p>

</div>

<!-- ======================================================================= -->
<div class="doc_subsection">
  <a name="format_common_descriptors">Object descriptor formats</a>
</div>

<div class="doc_text">
<p>The LLVM debugger expects the descriptors for program objects to start in a
canonical format, but the descriptors can include additional information
appended at the end that is source-language specific. All LLVM debugging
information is versioned, allowing backwards compatibility in the case that the
core structures need to change in some way. Also, all debugging information
objects start with a <a href="#format_common_tags">tag</a> to indicate what type
of object it is. The source-language is allowed to define its own objects, by
using unreserved tag numbers.</p>

<p>The lowest-level descriptors are those describing <a
href="#format_common_source_files">the files containing the program source
code</a>, as most other descriptors (sometimes indirectly) refer to them.
</p>
</div>


<!-- ----------------------------------------------------------------------- -->
<div class="doc_subsubsection">
  <a name="format_common_source_files">Representation of source files</a>
</div>

<div class="doc_text">
<p>
Source file descriptors are patterned after the Dwarf "compile_unit" object.
The descriptor currently is defined to have at least the following LLVM
type entries:</p>

<pre>
%lldb.compile_unit = type {
       uint,                 <i>;; Tag: <a href="#tag_compile_unit">LLVM_COMPILE_UNIT</a></i>
       ushort,               <i>;; LLVM debug version number</i>
       ushort,               <i>;; Dwarf language identifier</i>
       sbyte*,               <i>;; Filename</i>
       sbyte*,               <i>;; Working directory when compiled</i>
       sbyte*                <i>;; Producer of the debug information</i>
}
</pre>

<p>
These descriptors contain the version number for the debug info, a source
language ID for the file (we use the Dwarf 3.0 ID numbers, such as
<tt>DW_LANG_C89</tt>, <tt>DW_LANG_C_plus_plus</tt>, <tt>DW_LANG_Cobol74</tt>,
etc), and three strings: the filename, the working directory of the compiler,
and an identifier string for the compiler that produced it. Note that actual
compile_unit declarations must also include an <a
href="#format_common_anchors">anchor</a> to <tt>llvm.dbg.translation_units</tt>,
but it is not specified where the anchor is to be located. Here is an example
descriptor:
</p>

<pre>
%arraytest_source_file = internal constant %lldb.compile_unit {
    <a href="#tag_compile_unit">uint 17</a>,                                                      ; Tag value
    ushort 0,                                                     ; Version #0
    ushort 1,                                                     ; DW_LANG_C89
    sbyte* getelementptr ([12 x sbyte]* %.str_1, long 0, long 0), ; filename
    sbyte* getelementptr ([12 x sbyte]* %.str_2, long 0, long 0), ; working dir
    sbyte* getelementptr ([12 x sbyte]* %.str_3, long 0, long 0), ; producer
    {}* %llvm.dbg.translation_units                               ; Anchor
}
%.str_1 = internal constant [12 x sbyte] c"arraytest.c\00"
%.str_2 = internal constant [12 x sbyte] c"/home/sabre\00"
%.str_3 = internal constant [12 x sbyte] c"llvmgcc 3.4\00"
</pre>

<p>
Note that the LLVM constant merging pass should eliminate duplicate copies of
the strings that get emitted to each translation unit, such as the producer.
</p>
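<p>To make the canonical layout concrete, the following Python sketch decodes a
descriptor that has already been read into a plain tuple. It is an illustration
under stated assumptions, not the debugger's implementation: the
<tt>CompileUnit</tt> record and <tt>decode_compile_unit</tt> function are
hypothetical, and only the three Dwarf language codes named above are
listed.</p>

```python
from collections import namedtuple

LLVM_COMPILE_UNIT = 17  # tag value used for source file descriptors

# Dwarf 3 language codes for the three languages named in the text.
DW_LANG_NAMES = {
    1: "DW_LANG_C89",
    4: "DW_LANG_C_plus_plus",
    5: "DW_LANG_Cobol74",
}

CompileUnit = namedtuple(
    "CompileUnit", "version language filename directory producer")


def decode_compile_unit(fields):
    """Decode the six canonical fields; anything after them is ignored."""
    tag, version, language, filename, directory, producer = fields[:6]
    if tag != LLVM_COMPILE_UNIT:
        raise ValueError("not a compile unit descriptor: tag %d" % tag)
    name = DW_LANG_NAMES.get(language, "unknown")
    return CompileUnit(version, name, filename, directory, producer)


# The example descriptor above, including its trailing anchor field.
ARRAYTEST = (17, 0, 1, "arraytest.c", "/home/sabre", "llvmgcc 3.4",
             "llvm.dbg.translation_units")
```

<p>Because only the leading fields are read, the trailing anchor and any
source-language-specific additions do not affect decoding.</p>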

</div>


<!-- ----------------------------------------------------------------------- -->
<div class="doc_subsubsection">
  <a name="format_program_objects">Representation of program objects</a>
</div>

<div class="doc_text">
<p>
The LLVM debugger needs to know about some source-language program objects, in
order to build stack traces, print information about local variables, and other
related activities. The LLVM debugger differentiates between three different
types of program objects: subprograms (functions, messages, methods, etc),
variables (locals and globals), and others. Because source-languages have
widely varying forms of these objects, the LLVM debugger expects only a few
fields in the descriptor for each object:
</p>

<pre>
%lldb.object = type {
       uint,                  <i>;; <a href="#format_common_tags">A tag</a></i>
       <i>any</i>*,                  <i>;; The <a href="#format_common_object_contexts">context</a> for the object</i>
       sbyte*                 <i>;; The object 'name'</i>
}
</pre>

<p>The first field contains a tag for the descriptor. The second field contains
either a pointer to the descriptor for the containing <a
href="#format_common_source_files">source file</a>, or it contains a pointer to
another program object whose context pointer eventually reaches a source file.
Through this <a href="#format_common_object_contexts">context</a> pointer, the
LLVM debugger can establish the debug version number of the object.</p>
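<p>A minimal sketch of that lookup is shown below. It assumes descriptors have
been loaded into Python dictionaries with hypothetical <tt>tag</tt>,
<tt>context</tt> and <tt>version</tt> keys; the <tt>debug_version</tt> helper is
an illustration, not part of the debugger.</p>

```python
LLVM_COMPILE_UNIT = 17
LLVM_SUBPROGRAM = 46
LLVM_VARIABLE = 52


def debug_version(descriptor):
    """Follow context pointers up to the source file and return its version.

    Returns None if the chain ends (or loops) without reaching a source
    file descriptor.
    """
    seen = set()
    current = descriptor
    while current is not None and id(current) not in seen:
        seen.add(id(current))
        if current["tag"] == LLVM_COMPILE_UNIT:
            return current["version"]
        current = current.get("context")
    return None


SOURCE_FILE = {"tag": LLVM_COMPILE_UNIT, "version": 0}
FOO = {"tag": LLVM_SUBPROGRAM, "context": SOURCE_FILE, "name": "foo"}
X = {"tag": LLVM_VARIABLE, "context": FOO, "name": "X"}
```

<p>Here the variable <tt>X</tt> reaches its source file through the enclosing
subprogram <tt>foo</tt>, so both objects report the version stored in the source
file descriptor.</p>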

<p>The third field contains a string that the debugger can use to identify the
object if it does not contain explicit support for the source-language in use
(i.e., the 'unknown' source language handler uses this string). This should be
some sort of unmangled string that corresponds to the object, but it is a
quality-of-implementation issue what exactly it contains (it is legal, though
not useful, for all of these strings to be null).</p>

<p>Note again that descriptors can be extended to include
source-language-specific information in addition to the fields required by the
LLVM debugger. See the <a href="#ccxx_descriptors">section on the C/C++
front-end</a> for more information. Also remember that global objects
(functions, selectors, global variables, etc) must contain an <a
href="#format_common_anchors">anchor</a> to the <tt>llvm.dbg.globals</tt>
variable.</p>
</div>


<!-- ======================================================================= -->
<div class="doc_subsection">
  <a name="format_common_object_contexts">Program object contexts</a>
</div>

<div class="doc_text">
<pre>
Allow source-language-specific contexts, used to identify namespaces, etc.
Must end up in a source file descriptor.
Debugger core ignores all unknown context objects.
</pre>
</div>

<!-- ======================================================================= -->
<div class="doc_subsection">
  <a name="format_common_intrinsics">Debugger intrinsic functions</a>
</div>

<div class="doc_text">
<pre>
Define each intrinsic, as an extension of the language reference manual.

llvm.dbg.stoppoint
llvm.dbg.region.start
llvm.dbg.region.end
llvm.dbg.function.start
llvm.dbg.declare
</pre>
</div>

<!-- ======================================================================= -->
<div class="doc_subsection">
  <a name="format_common_tags">Values for debugger tags</a>
</div>

<div class="doc_text">

<p>These tag values happen to be the same as those of the similarly named
Dwarf 3 tags; this may change in the future.</p>

<pre>
  <a name="tag_compile_unit">LLVM_COMPILE_UNIT</a>     : 17
  <a name="tag_subprogram">LLVM_SUBPROGRAM</a>       : 46
  <a name="tag_variable">LLVM_VARIABLE</a>         : 52
<!--  <a name="tag_formal_parameter">LLVM_FORMAL_PARAMETER :  5-->
</pre>
</div>



<!-- *********************************************************************** -->
<div class="doc_section">
  <a name="ccxx_frontend">C/C++ front-end specific debug information</a>
</div>
<!-- *********************************************************************** -->

<div class="doc_text">

<p>The C and C++ front-ends represent information about the program in a format
that is effectively identical to <a
href="http://www.eagercon.com/dwarf/dwarf3std.htm">Dwarf 3.0</a> in terms of
information content. This allows code generators to trivially support native
debuggers by generating standard Dwarf information, and it contains enough
information for non-Dwarf targets to translate it as needed.</p>

<p>The basic debug information required by the debugger is (intentionally)
designed to be as minimal as possible. This basic information is so minimal
that it is unlikely that <b>any</b> source-language could be adequately
described by it. Because of this, the debugger format was designed for
extension to support source-language-specific information. The extended
descriptors are read and interpreted by the <a
href="#arch_info">language-specific</a> modules in the debugger if there is
support available; otherwise they are ignored.</p>
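<p>The fallback behaviour can be pictured with a small dispatch table. The
Python below is only a sketch: the handler classes, the <tt>handler_for</tt>
function and the <tt>signature</tt> extension field are all hypothetical names
chosen for this illustration, and the language codes are the Dwarf 3 values for
C89 and Cobol74.</p>

```python
class UnknownLanguage:
    """Fallback handler: relies only on the canonical 'name' field."""

    def describe(self, descriptor):
        return descriptor["name"]


class CFamilyLanguage:
    """Hypothetical C/C++ handler that also reads an appended field."""

    def describe(self, descriptor):
        return descriptor["name"] + descriptor.get("signature", "")


def handler_for(language_id, handlers):
    """Pick the module registered for `language_id`, else the fallback."""
    return handlers.get(language_id, UnknownLanguage())


DW_LANG_C89 = 1
DW_LANG_Cobol74 = 5
HANDLERS = {DW_LANG_C89: CFamilyLanguage()}

# A descriptor with one source-language-specific field appended.
FOO = {"name": "foo", "signature": "(int, char**)"}
```

<p>With this table, a C89 object is described using its extended field, while
an object from a language with no registered module falls back to the plain
name and the extra field is ignored.</p>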

<p>This section describes the extensions used to represent C and C++ programs.
Other languages could pattern themselves after this (which itself is tuned to
representing programs in the same way that Dwarf 3 does), or they could choose
to provide completely different extensions if they don't fit into the Dwarf
model. As support for debugging information gets added to the various LLVM
source-language front-ends, the information used should be documented here.</p>

</div>

<!-- ======================================================================= -->
<div class="doc_subsection">
  <a name="ccxx_pse">Program Scope Entries</a>
</div>

<div class="doc_text">
<p>TODO</p>
</div>

<!-- ----------------------------------------------------------------------- -->
<div class="doc_subsubsection">
  <a name="ccxx_compilation_units">Compilation unit entries</a>
</div>

<div class="doc_text">
<p>
Translation units do not add any information over the standard <a
href="#format_common_source_files">source file representation</a> already
expected by the debugger. As such, the C and C++ front-ends use descriptors of
the type specified there, with a trailing <a
href="#format_common_anchors">anchor</a>.
</p>
</div>

<!-- ----------------------------------------------------------------------- -->
<div class="doc_subsubsection">
  <a name="ccxx_modules">Module, namespace, and importing entries</a>
</div>

<div class="doc_text">
<p>TODO</p>
</div>

<!-- ======================================================================= -->
<div class="doc_subsection">
  <a name="ccxx_dataobjects">Data objects (program variables)</a>
</div>

<div class="doc_text">
<p>TODO</p>
</div>


<!-- *********************************************************************** -->

<hr>
<address>
  <a href="http://jigsaw.w3.org/css-validator/check/referer"><img
  src="http://jigsaw.w3.org/css-validator/images/vcss" alt="Valid CSS!"></a>
  <a href="http://validator.w3.org/check/referer"><img
  src="http://www.w3.org/Icons/valid-html401" alt="Valid HTML 4.01!"></a>

  <a href="mailto:sabre@nondot.org">Chris Lattner</a><br>
  <a href="http://llvm.cs.uiuc.edu">LLVM Compiler Infrastructure</a><br>
  Last modified: $Date$
</address>

</body>
</html>