<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
<link href="style.css" rel="stylesheet" type="text/css" />
<title>LLDB Example - Python Scripting to Debug a Problem</title>
</head>

<body>
  <div class="www_title">
    Example - Using Scripting and Python to Debug in LLDB
  </div>

<div id="container">
  <div id="content">
    <!--#include virtual="sidebar.incl"-->
    <div id="middle">
      <div class="post">
        <h1 class ="postheader">The Test Program and Input</h1>
        <div class="postcontent">

          <p>We have a simple C program (dictionary.c) that reads in a text file and
          stores all the words from the file in a binary search tree, sorted
          alphabetically. It then enters a loop that prompts the user for a word,
          searches for the word in the tree (using binary search), and reports
          whether or not it found the word.</p>

          <p>The input text file we are using to test our program contains the text of
          William Shakespeare's famous tragedy "Romeo and Juliet".</p>

        </div>
        <div class="postfooter"></div>

      <div class="post">
        <h1 class ="postheader">The Bug</h1>
        <div class="postcontent">

          <p>When we run our program, we find there is a problem. While it
          successfully finds some of the words we would expect to find, such as "love"
          or "sun", it fails to find the word "Romeo", which must be in the input text
          file:</p>

          <code color=#ff0000>
            % ./dictionary Romeo-and-Juliet.txt<br>
            Dictionary loaded.<br>
            Enter search word: love<br>
            Yes!<br>
            Enter search word: sun<br>
            Yes!<br>
            Enter search word: Romeo<br>
            No!<br>
            Enter search word: ^D<br>
            %<br>
          </code>

        </div>
        <div class="postfooter"></div>


      <div class="post">
        <h1 class ="postheader">Is the word in our tree: Using Depth First Search</h1>
        <div class="postcontent">

          <p>Our first job is to determine whether the word "Romeo" was actually inserted
          into the tree. Since "Romeo and Juliet" has thousands of words, examining our
          binary search tree by hand is completely impractical. Therefore we will write
          a Python script to search the tree for us. We will write a recursive Depth
          First Search function that descends from the root, using the tree's ordering
          to choose a branch at each node, and maintains information about the path from
          the root of the tree to the current node. If it finds the word in the tree, it
          returns the path from the root to the node containing the word. This is what
          our DFS function in Python looks like, with line numbers added for easy
          reference in later explanations:</p>

          <code>
<pre><tt>
 1: def DFS (root, word, cur_path):
 2:     root_word_ptr = root.GetChildMemberWithName ("word")
 3:     left_child_ptr = root.GetChildMemberWithName ("left")
 4:     right_child_ptr = root.GetChildMemberWithName ("right")
 5:     root_word = root_word_ptr.GetSummary()
 6:     end = len (root_word) - 1
 7:     if root_word[0] == '"' and root_word[end] == '"':
 8:         root_word = root_word[1:end]
 9:     end = len (root_word) - 1
10:     if root_word[0] == '\'' and root_word[end] == '\'':
11:         root_word = root_word[1:end]
12:     if root_word == word:
13:         return cur_path
14:     elif word &lt; root_word:
15:         if left_child_ptr.GetValue() == None:
16:             return ""
17:         else:
18:             cur_path = cur_path + "L"
19:             return DFS (left_child_ptr, word, cur_path)
20:     else:
21:         if right_child_ptr.GetValue() == None:
22:             return ""
23:         else:
24:             cur_path = cur_path + "R"
25:             return DFS (right_child_ptr, word, cur_path)
</tt></pre>
          </code>
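          <p>To experiment with the path-building logic outside the debugger, the same
          search can be sketched on plain Python objects. This is only an illustration:
          <code>Node</code> and <code>find_path</code> are hypothetical names that do not
          appear in tree_utils.py, and the sketch reads ordinary attributes where DFS
          reads SBValue objects.</p>

```python
# Plain-Python stand-in for the debuggee's tree. Node and find_path are
# hypothetical names used only for this illustration.
class Node:
    def __init__(self, word, left=None, right=None):
        self.word = word
        self.left = left
        self.right = right


def find_path(node, word, cur_path=""):
    """Return the 'L'/'R' moves from node down to word, or "" if absent."""
    if node.word == word:
        return cur_path
    if word < node.word:
        if node.left is None:
            return ""
        return find_path(node.left, word, cur_path + "L")
    if node.right is None:
        return ""
    return find_path(node.right, word, cur_path + "R")


tree = Node("m", Node("d", Node("a"), Node("g")), Node("t"))
print(find_path(tree, "g"))   # LR
print(find_path(tree, "zz"))  # prints an empty line: not found
```

          <p>As in the DFS function above, an empty string is returned both when the
          word is missing and when it sits at the root, so the two cases cannot be told
          apart from the return value alone.</p>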

        </div>
        <div class="postfooter"></div>


      <div class="post">
        <h1 class ="postheader"><a name="accessing-variables">Accessing &amp; Manipulating <strong>Program</strong> Variables in Python</a>
</h1>
        <div class="postcontent">

          <p>Before we can call any Python function on any of our program's variables, we
          need to get the variable into a form that Python can access. To show you how to
          do this we will look at the parameters for the DFS function. The first
          parameter is going to be a node in our binary search tree, put into a Python
          variable. The second parameter is the word we are searching for (a string), and
          the third parameter is a string representing the path from the root of the tree
          to our current node.</p>

          <p>The most interesting parameter is the first one, the Python variable that
          needs to contain a node in our search tree. How can we take a variable out of
          our program and put it into a Python variable? What kind of Python variable
          will it be? The answer is to use the LLDB API functions, provided as part of
          the LLDB Python module. When we run Python from inside LLDB, LLDB
          automatically gives us our current frame object as a Python variable,
          "lldb.frame". This variable has the type "SBFrame" (see the LLDB API for
          more information about SBFrame objects). One of the things we can do with a
          frame object is ask it to find and return one of its local variables. We will
          call the API function "FindVariable" on the lldb.frame object to give us our
          dictionary variable as a Python variable:</p>

          <code>
            root = lldb.frame.FindVariable ("dictionary")
          </code>

          <p>The line above, executed in the Python script interpreter in LLDB, asks the
          current frame to find the variable named "dictionary" and return it. We then
          store the returned value in the Python variable named "root". This answers the
          question of HOW to get the variable, but it still doesn't explain WHAT actually
          gets put into "root". If you examine the LLDB API, you will find that the
          SBFrame method "FindVariable" returns an object of type SBValue. SBValue
          objects are used, among other things, to wrap up program variables and values.
          The SBValue class defines many useful methods for getting information or
          child values out of SBValues. For complete information, see
          the header file <a href="http://llvm.org/svn/llvm-project/lldb/trunk/include/lldb/API/SBValue.h">SBValue.h</a>. The
          SBValue methods that we use in our DFS function are
          <code>GetChildMemberWithName()</code>,
          <code>GetSummary()</code>, and <code>GetValue()</code>.</p>

        </div>
        <div class="postfooter"></div>


      <div class="post">
        <h1 class ="postheader">Explaining Depth First Search Script in Detail</h1>
        <div class="postcontent">

          <p><strong>"DFS" Overview.</strong> Before diving into the details of this
          code, it would be best to give a high-level overview of what it does. The nodes
          in our binary search tree were defined to have type <code>tree_node *</code>,
          where <code>tree_node</code> is defined as:</p>

          <code>
<pre><tt>typedef struct tree_node
{
  const char *word;
  struct tree_node *left;
  struct tree_node *right;
} tree_node;</tt></pre></code>

          <p>Lines 2-11 of DFS get data out of the current tree node and get
          ready to do the actual search; lines 12-25 are the actual depth-first search.
          Lines 2-4 of our DFS function get the <code>word</code>, <code>left</code> and
          <code>right</code> fields out of the current node and store them in Python
          variables. Since <code>root_word_ptr</code> is a pointer to our word, and we
          want the actual word, line 5 calls <code>GetSummary()</code> to get a string
          containing the value out of the pointer. Since <code>GetSummary()</code> adds
          quotes around its result, lines 6-11 strip surrounding quotes off the word.</p>
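          <p>The quote stripping in lines 6-11 can be tried on its own with ordinary
          strings. The helper below is a minimal sketch; <code>strip_quotes</code> is a
          hypothetical name, and unlike the original lines it checks the length first so
          that an empty or one-character string is returned unchanged instead of raising
          an error.</p>

```python
def strip_quotes(summary):
    """Remove one pair of matching double quotes, then one pair of single quotes.

    Mirrors the order used in lines 6-11 of DFS. The length check is an
    addition that the original code does not have.
    """
    for quote in ('"', "'"):
        if len(summary) >= 2 and summary[0] == quote and summary[-1] == quote:
            summary = summary[1:-1]
    return summary


print(strip_quotes('"Romeo"'))  # Romeo
```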

          <p>Line 12 checks to see if the word in the current node is the one we are
          searching for. If so, we are done, and line 13 returns the current path.
          Otherwise, line 14 checks to see if we should go left (search word comes before
          the current word). If we decide to go left, line 15 checks to see if the left
          child pointer is NULL ("None" is the Python equivalent of NULL). If the left
          pointer is NULL, then the word is not in this tree and we return an empty path
          (line 16). Otherwise, we add an "L" to the end of our current path string, to
          indicate we are going left (line 18), and then recurse on the left child (line
          19). Lines 20-25 are the same as lines 14-19, except for going right rather
          than going left.</p>

          <p>One other note: Typing something as long as our DFS function directly into
          the interpreter can be difficult, as making a single typing mistake means having
          to start all over. Therefore we recommend doing as we have done: writing your
          longer, more complicated script functions in a separate file (in this case
          tree_utils.py) and then importing it into your LLDB Python interpreter.</p>

        </div>
        <div class="postfooter"></div>


      <div class="post">
        <h1 class ="postheader">Seeing the DFS Script in Action</h1>
        <div class="postcontent">


          <p>At this point we are ready to use the DFS function to see if the word "Romeo"
          is in our tree or not. To actually use it in LLDB on our dictionary program,
          you would do something like this:</p>

          <code>
            % <strong>lldb</strong><br>
            (lldb) <strong>process attach -n "dictionary"</strong><br>
            Architecture set to: x86_64.<br>
            Process 521 stopped<br>
            * thread #1: tid = 0x2c03, 0x00007fff86c8bea0 libSystem.B.dylib`read$NOCANCEL + 8, stop reason = signal SIGSTOP<br>
            frame #0: 0x00007fff86c8bea0 libSystem.B.dylib`read$NOCANCEL + 8<br>
            (lldb) <strong>breakpoint set -n find_word</strong><br>
            Breakpoint created: 1: name = 'find_word', locations = 1, resolved = 1<br>
            (lldb) <strong>continue</strong><br>
            Process 521 resuming<br>
            Process 521 stopped<br>
            * thread #1: tid = 0x2c03, 0x0000000100001830 dictionary`find_word + 16 <br>
            at dictionary.c:105, stop reason = breakpoint 1.1<br>
            frame #0: 0x0000000100001830 dictionary`find_word + 16 at dictionary.c:105<br>
            102 int<br>
            103 find_word (tree_node *dictionary, char *word)<br>
            104 {<br>
            -> 105 if (!word || !dictionary)<br>
            106 return 0;<br>
            107 <br>
            108 int compare_value = strcmp (word, dictionary->word);<br>
            (lldb) <strong>script</strong><br>
            Python Interactive Interpreter. To exit, type 'quit()', 'exit()' or Ctrl-D.<br>
            >>> <strong>import tree_utils</strong><br>
            >>> <strong>root = lldb.frame.FindVariable ("dictionary")</strong><br>
            >>> <strong>current_path = ""</strong><br>
            >>> <strong>path = tree_utils.DFS (root, "Romeo", current_path)</strong><br>
            >>> <strong>print path</strong><br>
            LLRRL<br>
            >>> <strong>^D</strong><br>
            (lldb) <br>
          </code>

          <p>The first bit of code above shows starting lldb, attaching to the dictionary
          program, and getting to the find_word function in LLDB. The interesting part
          (as far as this example is concerned) begins when we enter the
          <code>script</code> command and drop into the embedded interactive Python
          interpreter. We will go over this Python code line by line. The first line</p>

          <code>
            import tree_utils
          </code>

          <p>imports the file where we wrote our DFS function, tree_utils.py, into Python.
          Notice that to import the file we leave off the ".py" extension. We can now
          call any function in that file, giving it the prefix "tree_utils.", so that
          Python knows where to look for the function. The line</p>

          <code>
            root = lldb.frame.FindVariable ("dictionary")
          </code>

          <p>gets our program variable "dictionary" (which contains the binary search
          tree) and puts it into the Python variable "root". See
          <a href="#accessing-variables">Accessing &amp; Manipulating Program Variables in Python</a>
          above for more details about how this works. The next line is</p>

          <code>
            current_path = ""
          </code>

          <p>This line initializes current_path, the path from the root of the tree to
          our current node. Since we are starting at the root of the tree, our current
          path starts as an empty string. As we go right and left through the tree, the
          DFS function will append an 'R' or an 'L' to the current path, as appropriate.
          The line</p>

          <code>
            path = tree_utils.DFS (root, "Romeo", current_path)
          </code>

          <p>calls our DFS function (prefixing it with the module name so that Python can
          find it). We pass in our binary tree stored in the variable <code>root</code>,
          the word we are searching for, and our current path. We assign whatever path
          the DFS function returns to the Python variable <code>path</code>.</p>


          <p>Finally, we want to see if the word was found or not, and if so we want to
          see the path through the tree to the word. So we do</p>

          <code>
            print path
          </code>

          <p>From this we can see that the word "Romeo" was indeed found in the tree, and
          the path from the root of the tree to the node containing "Romeo" is
          left-left-right-right-left.</p>

        </div>
        <div class="postfooter"></div>


      <div class="post">
        <h1 class ="postheader">What next? Using Breakpoint Command Scripts...</h1>
        <div class="postcontent">

          <p>We are halfway to figuring out what the problem is. We know the word we are
          looking for is in the binary tree, and we know exactly where it is in the binary
          tree. Now we need to figure out why our binary search algorithm is not finding
          the word. We will do this using breakpoint command scripts.</p>


          <p>The idea is as follows. The binary search algorithm has two main decision
          points: the decision to follow the right branch, and the decision to follow
          the left branch. We will set a breakpoint at each of these decision points, and
          attach a Python breakpoint command script to each breakpoint. The breakpoint
          commands will use the global <code>path</code> Python variable that we got from
          our DFS function. Each time one of these decision breakpoints is hit, the script
          will compare the actual decision with the decision that the front of the
          <code>path</code> variable says should be made (the first character of the
          path). If the actual decision and the path agree, then the front character is
          stripped off the path, and execution is resumed. In this case the user never
          even sees the breakpoint being hit. But if the decision differs from what the
          path says it should be, then the script prints out a message and does NOT resume
          execution, leaving the user sitting at the first point where a wrong decision is
          being made.</p>
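          <p>The comparison each breakpoint command performs can be modelled without a
          debugger. The sketch below uses a hypothetical helper,
          <code>check_decision</code>, that returns the remaining path and whether
          execution should resume. As an added assumption, it treats an empty path as a
          mismatch.</p>

```python
def check_decision(path, actual):
    """Compare one branch taken by the program with the next expected move.

    Returns (remaining_path, resume). resume is True when the program took
    the expected branch, so execution should continue. An empty path counts
    as a mismatch (an assumption of this sketch).
    """
    if path and path[0] == actual:
        return path[1:], True
    return path, False


path = "LLRRL"
for actual in "LLR":
    path, resume = check_decision(path, actual)
print(path, resume)  # RL True
```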

        </div>
        <div class="postfooter"></div>


      <div class="post">
        <h1 class ="postheader">Side Note: Python Breakpoint Command Scripts are NOT What They Seem</h1>
        <div class="postcontent">

        </div>
        <div class="postfooter"></div>

          <p>What do we mean by that? When you enter a Python breakpoint command in LLDB,
          it appears that you are entering one or more plain lines of Python. BUT LLDB
          then takes what you entered and wraps it into a Python FUNCTION (just like using
          the "def" Python command). It automatically gives the function an obscure,
          unique, hard-to-stumble-across function name, and gives it two parameters:
          <code>frame</code> and <code>bp_loc</code>. When the breakpoint gets hit, LLDB
          wraps up the frame object where the breakpoint was hit, and the breakpoint
          location object for the breakpoint that was hit, and puts them into Python
          variables for you. It then calls the Python function that was created for the
          breakpoint command, and passes in the frame and breakpoint location objects.</p>

          <p>So, being practical, what does this mean for you when you write your Python
          breakpoint commands? It means that there are two things you need to keep in
          mind: 1. If you want to access any Python variables created outside your script,
          <strong>you must declare such variables to be global</strong>. If you do not
          declare them as global, then the Python function will treat any variable it
          assigns to as local, and you will get unexpected behavior. 2. <strong>All Python
          breakpoint command scripts automatically have a <code>frame</code> and a
          <code>bp_loc</code> variable.</strong> The variables are pre-loaded by LLDB
          with the correct context for the breakpoint. You do not have to use these
          variables, but they are there if you want them.</p>
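          <p>The first point is ordinary Python scoping and can be seen without LLDB. In
          the sketch below, the two function names are hypothetical stand-ins for the
          wrapper function that LLDB generates around a breakpoint command.</p>

```python
path = "LLRRL"


def without_global():
    # Assigning to path makes it local to this function, so reading it on
    # the right-hand side raises UnboundLocalError.
    path = path[1:]


def with_global():
    global path
    path = path[1:]


try:
    without_global()
except UnboundLocalError:
    print("without global: UnboundLocalError")

with_global()
print(path)  # LRRL
```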

        </div>
        <div class="postfooter"></div>


      <div class="post">
        <h1 class ="postheader">The Decision Point Breakpoint Commands</h1>
        <div class="postcontent">

          <p>This is what the Python breakpoint command script would look like for the
          decision to go right:</p>

<code><pre><tt>
global path
if path[0] == 'R':
    path = path[1:]
    thread = frame.GetThread()
    process = thread.GetProcess()
    process.Continue()
else:
    print "Here is the problem; going right, should go left!"
</tt></pre></code>

          <p>Just as a reminder, LLDB is going to take this script and wrap it up in a
          function, like this:</p>

<code><pre><tt>
def some_unique_and_obscure_function_name (frame, bp_loc):
    global path
    if path[0] == 'R':
        path = path[1:]
        thread = frame.GetThread()
        process = thread.GetProcess()
        process.Continue()
    else:
        print "Here is the problem; going right, should go left!"
</tt></pre></code>

          <p>LLDB will call the function, passing in the correct frame and breakpoint
          location whenever the breakpoint gets hit. There are several things to notice
          about this function. The first one is that we are accessing and updating a
          piece of state (the <code>path</code> variable), and actually conditioning our
          behavior based upon this variable. Since the variable was defined outside of
          our script (and therefore outside of the corresponding function) we need to tell
          Python that we are accessing a global variable. That is what the first line of
          the script does. Next we check where the path says we should go and compare it to
          our decision (recall that we are at the breakpoint for the decision to go
          right). If the path agrees with our decision, then we strip the first character
          off of the path.</p>

          <p>Since the decision matched the path, we want to resume execution. To do this
          we make use of the <code>frame</code> parameter that LLDB guarantees will be
          there for us. We use LLDB API functions to get the current thread from the
          current frame, and then to get the process from the thread. Once we have the
          process, we tell it to resume execution (using the <code>Continue()</code> API
          function).</p>

          <p>If the decision to go right does not agree with the path, then we do not
          resume execution. We leave the process stopped at the breakpoint (by doing
          nothing), and we print an informational message telling the user we have found
          the problem, and what the problem is.</p>

        </div>
        <div class="postfooter"></div>

      <div class="post">
        <h1 class ="postheader">Actually Using the Breakpoint Commands</h1>
        <div class="postcontent">

          <p>Now we will look at what happens when we actually use these breakpoint
          commands on our program. Doing a <code>source list -n find_word</code> shows
          us the function containing our two decision points. Looking at the code below,
          we see that we want to set our breakpoints on lines 113 and 115:</p>

<code><pre><tt>
(lldb) source list -n find_word
File: /Volumes/Data/HD2/carolinetice/Desktop/LLDB-Web-Examples/dictionary.c.
101
102  int
103  find_word (tree_node *dictionary, char *word)
104  {
105    if (!word || !dictionary)
106      return 0;
107
108    int compare_value = strcmp (word, dictionary->word);
109
110    if (compare_value == 0)
111      return 1;
112    else if (compare_value &lt; 0)
113      return find_word (dictionary->left, word);
114    else
115      return find_word (dictionary->right, word);
116  }
117
</tt></pre></code>

          <p>So, we set our breakpoints, enter our breakpoint command scripts, and see
          what happens:</p>

<code><pre><tt>
(lldb) breakpoint set -l 113
Breakpoint created: 2: file ='dictionary.c', line = 113, locations = 1, resolved = 1
(lldb) breakpoint set -l 115
Breakpoint created: 3: file ='dictionary.c', line = 115, locations = 1, resolved = 1
(lldb) breakpoint command add -s python 2
Enter your Python command(s). Type 'DONE' to end.
> global path
> if (path[0] == 'L'):
>     path = path[1:]
>     thread = frame.GetThread()
>     process = thread.GetProcess()
>     process.Continue()
> else:
>     print "Here is the problem. Going left, should go right!"
> DONE
(lldb) breakpoint command add -s python 3
Enter your Python command(s). Type 'DONE' to end.
> global path
> if (path[0] == 'R'):
>     path = path[1:]
>     thread = frame.GetThread()
>     process = thread.GetProcess()
>     process.Continue()
> else:
>     print "Here is the problem. Going right, should go left!"
> DONE
(lldb) continue
Process 696 resuming
Here is the problem. Going right, should go left!
Process 696 stopped
* thread #1: tid = 0x2d03, 0x000000010000189f dictionary`find_word + 127 at dictionary.c:115, stop reason = breakpoint 3.1
  frame #0: 0x000000010000189f dictionary`find_word + 127 at dictionary.c:115
   112    else if (compare_value &lt; 0)
   113      return find_word (dictionary->left, word);
   114    else
-> 115      return find_word (dictionary->right, word);
   116  }
   117
   118  void
(lldb)
</tt></pre></code>


          <p>After setting our breakpoints, adding our breakpoint commands and continuing,
          we run for a little bit and then hit one of our breakpoints, printing out the
          error message from the breakpoint command. Apparently at this point in the
          tree, our search algorithm decided to go right, but our path says the node we
          want is to the left. Examining the word at the node where we stopped, and our
          search word, we see:</p>

          <code>
            (lldb) expr dictionary->word<br>
            (const char *) $1 = 0x0000000100100080 "dramatis"<br>
            (lldb) expr word<br>
            (char *) $2 = 0x00007fff5fbff108 "romeo"<br>
          </code>

          <p>So the word at our current node is "dramatis", and the word we are searching
          for is "romeo". "romeo" comes after "dramatis" alphabetically, so it seems like
          going right would be the correct decision. Let's ask Python what it thinks the
          path from the current node to our word is:</p>

          <code>
            (lldb) script print path<br>
            LLRRL<br>
          </code>

          <p>According to Python we need to go left-left-right-right-left from our current
          node to find the word we are looking for. Let's double check our tree, and see
          what word it has at that node:</p>

          <code>
            (lldb) expr dictionary->left->left->right->right->left->word<br>
            (const char *) $4 = 0x0000000100100880 "Romeo"<br>
          </code>
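          <p>The expression typed above follows mechanically from the path string, so it
          can be generated rather than typed by hand. The helper below is a hypothetical
          convenience, not part of tree_utils.py. It assumes the field names
          <code>left</code>, <code>right</code> and <code>word</code> from the
          <code>tree_node</code> definition.</p>

```python
def path_to_expression(path, root="dictionary"):
    """Turn a path such as "LLRRL" into a C expression for that node's word."""
    steps = {"L": "left", "R": "right"}
    return "->".join([root] + [steps[c] for c in path] + ["word"])


print(path_to_expression("LLRRL"))
# dictionary->left->left->right->right->left->word
```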

          <p>So the word we are searching for is "romeo" and the word at our DFS location
          is "Romeo". Aha! One is uppercase and the other is lowercase: We seem to have
          a case conversion problem somewhere in our program (we do).</p>
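          <p>Why the case difference sends the search the wrong way can be checked in
          Python, whose string comparison orders ASCII words the same way
          <code>strcmp</code> does (by character code). The function name
          <code>goes_left</code> is hypothetical and simply mirrors the
          <code>compare_value &lt; 0</code> test in find_word.</p>

```python
def goes_left(search_word, node_word):
    """True when a search for search_word would take the left branch at node_word."""
    return search_word < node_word


# 'R' (82) sorts before 'd' (100), so "Romeo" is stored to the left of "dramatis".
print(goes_left("Romeo", "dramatis"))  # True
# 'r' (114) sorts after 'd' (100), so a search for "romeo" goes right instead.
print(goes_left("romeo", "dramatis"))  # False
```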

          <p>This is the end of our example on how you might use Python scripting in LLDB
          to help you find bugs in your program.</p>

        </div>
        <div class="postfooter"></div>

      <div class="post">
        <h1 class ="postheader">Source Files for The Example</h1>
        <div class="postcontent">


        </div>
        <div class="postfooter"></div>

          <p> The complete code for the Dictionary program (with case-conversion bug),
          the DFS function and other Python script examples (tree_utils.py) used for this
          example are available via the following file links:</p>

<a href="http://llvm.org/svn/llvm-project/lldb/trunk/examples/scripting/tree_utils.py">tree_utils.py</a> - Example Python functions using LLDB's API, including DFS<br>
<a href="http://llvm.org/svn/llvm-project/lldb/trunk/examples/scripting/dictionary.c">dictionary.c</a> - Sample dictionary program, with bug<br>

          <p>The text for "Romeo and Juliet" can be obtained from Project Gutenberg
          (http://www.gutenberg.org).</p>
      </div>
    </div>
  </div>
</div>
</body>
</html>