| <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" |
| "http://www.w3.org/TR/html4/strict.dtd"> |
| <html> |
| <head> |
| <meta http-equiv="Content-Type" content="text/html; charset=utf-8"> |
| <title>LLVM bugpoint tool: design and usage</title> |
| <link rel="stylesheet" href="_static/llvm.css" type="text/css"> |
| </head> |
| <body> |
| |
| <h1> |
| LLVM bugpoint tool: design and usage |
| </h1> |
| |
| <ul> |
| <li><a href="#desc">Description</a></li> |
| <li><a href="#design">Design Philosophy</a> |
| <ul> |
| <li><a href="#autoselect">Automatic Debugger Selection</a></li> |
| <li><a href="#crashdebug">Crash debugger</a></li> |
| <li><a href="#codegendebug">Code generator debugger</a></li> |
| <li><a href="#miscompilationdebug">Miscompilation debugger</a></li> |
| </ul></li> |
| <li><a href="#advice">Advice for using <tt>bugpoint</tt></a></li> |
| <li><a href="#notEnough">What to do when <tt>bugpoint</tt> isn't enough</a></li> |
| </ul> |
| |
| <div class="doc_author"> |
| <p>Written by <a href="mailto:sabre@nondot.org">Chris Lattner</a></p> |
| </div> |
| |
| <!-- *********************************************************************** --> |
| <h2> |
| <a name="desc">Description</a> |
| </h2> |
| <!-- *********************************************************************** --> |
| |
| <div> |
| |
| <p><tt>bugpoint</tt> narrows down the source of problems in LLVM tools and |
| passes. It can be used to debug three types of failures: optimizer crashes, |
| miscompilations by optimizers, or bad native code generation (including problems |
| in the static and JIT compilers). It aims to reduce large test cases to small, |
| useful ones. For example, if <tt>opt</tt> crashes while optimizing a |
| file, it will identify the optimization (or combination of optimizations) that |
| causes the crash, and reduce the file down to a small example which triggers the |
| crash.</p> |
| |
| <p>For detailed case scenarios, such as debugging <tt>opt</tt> or one of the |
| LLVM code generators, see the <a href="HowToSubmitABug.html">How To Submit a Bug |
| Report</a> document.</p> |
| |
| </div> |
| |
| <!-- *********************************************************************** --> |
| <h2> |
| <a name="design">Design Philosophy</a> |
| </h2> |
| <!-- *********************************************************************** --> |
| |
| <div> |
| |
| <p><tt>bugpoint</tt> is designed to be a useful tool without requiring any |
| hooks into the LLVM infrastructure at all. It works with any and all LLVM |
| passes and code generators, and does not need to "know" how they work. Because |
| of this, it may appear to do stupid things or miss obvious |
| simplifications. <tt>bugpoint</tt> is also designed to trade off programmer |
| time for computer time in the compiler-debugging process; consequently, it may |
| take a long period of (unattended) time to reduce a test case, but we feel it |
| is still worth it. Note that <tt>bugpoint</tt> is generally very quick unless |
| debugging a miscompilation where each test of the program (which requires |
| executing it) takes a long time.</p> |
| |
| <!-- ======================================================================= --> |
| <h3> |
| <a name="autoselect">Automatic Debugger Selection</a> |
| </h3> |
| |
| <div> |
| |
| <p><tt>bugpoint</tt> reads each <tt>.bc</tt> or <tt>.ll</tt> file specified on |
| the command line and links them together into a single module, called the test |
| program. If any LLVM passes are specified on the command line, it runs these |
| passes on the test program. If any of the passes crash, or if they produce |
| malformed output (which causes the verifier to abort), <tt>bugpoint</tt> starts |
| the <a href="#crashdebug">crash debugger</a>.</p> |
| |
| <p>Otherwise, if the <tt>-output</tt> option was not specified, |
| <tt>bugpoint</tt> runs the test program with the C backend (which is assumed to |
| generate good code) to generate a reference output. Once <tt>bugpoint</tt> has |
| a reference output for the test program, it tries executing it with the |
| selected code generator. If the selected code generator crashes, |
| <tt>bugpoint</tt> starts the <a href="#crashdebug">crash debugger</a> on the |
| code generator. Otherwise, if the resulting output differs from the reference |
| output, it assumes the difference resulted from a code generator failure, and |
| starts the <a href="#codegendebug">code generator debugger</a>.</p> |
| |
| <p>Finally, if the output of the selected code generator matches the reference |
| output, <tt>bugpoint</tt> runs the test program after all of the LLVM passes |
| have been applied to it. If its output differs from the reference output, it |
| assumes the difference resulted from a failure in one of the LLVM passes, and |
| enters the <a href="#miscompilationdebug">miscompilation debugger</a>. |
| Otherwise, there is no problem <tt>bugpoint</tt> can debug.</p> |
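| |
| <p>The selection order described above can be summarized as a small decision |
| function. The following is only an illustrative C++ sketch, not |
| <tt>bugpoint</tt>'s actual code: the <tt>Observations</tt> struct, the |
| <tt>Debugger</tt> enum, and <tt>selectDebugger</tt> are hypothetical names |
| introduced here for the example.</p> |
| |
```cpp
// Hypothetical summary of what was observed while running the test program.
// None of these names come from bugpoint's source.
struct Observations {
  bool passesCrashOrBreakVerifier; // a pass crashed or produced malformed output
  bool codegenCrashes;             // the selected code generator crashed
  bool codegenOutputMatches;       // code generator output == reference output
  bool optimizedOutputMatches;     // output after the passes == reference output
};

enum class Debugger { Crash, CodeGenerator, Miscompilation, None };

// Mirrors the order of checks described in the text: crashes first, then
// code generator output, then output after the LLVM passes have run.
Debugger selectDebugger(const Observations &o) {
  if (o.passesCrashOrBreakVerifier)
    return Debugger::Crash;
  if (o.codegenCrashes)
    return Debugger::Crash;
  if (!o.codegenOutputMatches)
    return Debugger::CodeGenerator;
  if (!o.optimizedOutputMatches)
    return Debugger::Miscompilation;
  return Debugger::None; // nothing bugpoint can debug
}
```
| |
| <p>The order of the checks matters in this sketch: a mismatch after the passes |
| is attributed to the passes only once the code generator has been seen to |
| reproduce the reference output.</p> |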
| |
| </div> |
| |
| <!-- ======================================================================= --> |
| <h3> |
| <a name="crashdebug">Crash debugger</a> |
| </h3> |
| |
| <div> |
| |
| <p>If an optimizer or code generator crashes, <tt>bugpoint</tt> will try as hard |
| as it can to reduce the list of passes (for optimizer crashes) and the size of |
| the test program. First, <tt>bugpoint</tt> figures out which combination of |
| optimizer passes triggers the bug. This is useful when debugging a problem |
| exposed by <tt>opt</tt>, for example, because it runs over 38 passes.</p> |
| |
| <p>Next, <tt>bugpoint</tt> tries removing functions from the test program, to |
| reduce its size. Usually it is able to reduce a test program to a single |
| function, when debugging intraprocedural optimizations. Once the number of |
| functions has been reduced, it attempts to delete various edges in the control |
| flow graph, to reduce the size of the function as much as possible. Finally, |
| <tt>bugpoint</tt> deletes any individual LLVM instructions whose absence does |
| not eliminate the failure. At the end, <tt>bugpoint</tt> should tell you what |
| passes crash, give you a bitcode file, and give you instructions on how to |
| reproduce the failure with <tt>opt</tt> or <tt>llc</tt>.</p> |
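| |
| <p>To give a feel for this kind of reduction, here is a minimal C++ sketch of |
| a greedy reducer that drops one element at a time while the failure still |
| reproduces. It is not the algorithm <tt>bugpoint</tt> implements; |
| <tt>reduceList</tt> and the <tt>stillFails</tt> oracle are hypothetical names, |
| and the oracle stands in for re-running the failing tool on a candidate |
| list of passes, functions, or instructions.</p> |
| |
```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Greedy one-at-a-time reduction (an illustrative assumption, not bugpoint's
// real strategy). `stillFails` reports whether the failure reproduces with
// the given candidate list.
std::vector<std::string> reduceList(
    std::vector<std::string> items,
    const std::function<bool(const std::vector<std::string> &)> &stillFails) {
  assert(stillFails(items) && "the full list must reproduce the failure");
  std::size_t i = 0;
  while (i < items.size()) {
    std::vector<std::string> candidate = items;
    candidate.erase(candidate.begin() + static_cast<std::ptrdiff_t>(i));
    if (stillFails(candidate))
      items = std::move(candidate); // element i was not needed; retry index i
    else
      ++i; // element i is required to trigger the failure; keep it
  }
  return items;
}
```
| |
| <p>Each call to the oracle corresponds to one run of the tool under test, which |
| is why reduction can take a long time when each run is slow.</p> |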
| |
| </div> |
| |
| <!-- ======================================================================= --> |
| <h3> |
| <a name="codegendebug">Code generator debugger</a> |
| </h3> |
| |
| <div> |
| |
| <p>The code generator debugger attempts to narrow down the amount of code that |
| is being miscompiled by the selected code generator. To do this, it takes the |
| test program and partitions it into two pieces: one piece which it compiles |
| with the C backend (into a shared object), and one piece which it runs with |
| either the JIT or the static LLC compiler. It uses several techniques to |
| reduce the amount of code pushed through the LLVM code generator, to reduce the |
| potential scope of the problem. After it is finished, it emits two bitcode |
| files (called "test" [to be compiled with the code generator] and "safe" [to be |
| compiled with the C backend], respectively), and instructions for reproducing |
| the problem. The code generator debugger assumes that the C backend produces |
| good code.</p> |
| |
| </div> |
| |
| <!-- ======================================================================= --> |
| <h3> |
| <a name="miscompilationdebug">Miscompilation debugger</a> |
| </h3> |
| |
| <div> |
| |
| <p>The miscompilation debugger works similarly to the code generator debugger. |
| It works by splitting the test program into two pieces, running the |
| optimizations specified on one piece, linking the two pieces back together, and |
| then executing the result. It attempts to narrow down the list of passes to |
| the one (or few) which are causing the miscompilation, then reduce the portion |
| of the test program which is being miscompiled. The miscompilation debugger |
| assumes that the selected code generator is working properly.</p> |
| |
| </div> |
| |
| </div> |
| |
| <!-- *********************************************************************** --> |
| <h2> |
| <a name="advice">Advice for using bugpoint</a> |
| </h2> |
| <!-- *********************************************************************** --> |
| |
| <div> |
| |
| <p><tt>bugpoint</tt> can be a remarkably useful tool, but it sometimes works in |
| non-obvious ways. Here are some hints and tips:</p> |
| |
| <ol> |
| <li>In the code generator and miscompilation debuggers, <tt>bugpoint</tt> only |
| works with programs that have deterministic output. Thus, if the program |
| outputs <tt>argv[0]</tt>, the date, time, or any other "random" data, |
| <tt>bugpoint</tt> may misinterpret differences in these data, when output, |
| as the result of a miscompilation. Programs should be temporarily modified |
| to disable outputs that are likely to vary from run to run. |
| |
| <li>In the code generator and miscompilation debuggers, debugging will go |
| faster if you manually modify the program or its inputs to reduce the |
| runtime while still exhibiting the problem. |
| |
| <li><tt>bugpoint</tt> is extremely useful when working on a new optimization: |
| it helps track down regressions quickly. To avoid having to relink |
| <tt>bugpoint</tt> every time you change your optimization, however, have |
| <tt>bugpoint</tt> dynamically load your optimization with the |
| <tt>-load</tt> option. |
| |
| <li><p><tt>bugpoint</tt> can generate a lot of output and run for a long period |
| of time. It is often useful to capture its output to a file. |
| For example, in the C shell, you can run:</p> |
| |
| <div class="doc_code"> |
| <p><tt>bugpoint ... |&amp; tee bugpoint.log</tt></p> |
| </div> |
| |
| <p>to get a copy of <tt>bugpoint</tt>'s output in the file |
| <tt>bugpoint.log</tt>, as well as on your terminal.</p> |
| |
| <li><tt>bugpoint</tt> cannot debug problems with the LLVM linker. If |
| <tt>bugpoint</tt> crashes before you see its "All input ok" message, |
| you might try <tt>llvm-link -v</tt> on the same set of input files. If |
| that also crashes, you may be experiencing a linker bug. |
| |
| <li><tt>bugpoint</tt> is useful for proactively finding bugs in LLVM. |
| Invoking <tt>bugpoint</tt> with the <tt>-find-bugs</tt> option will cause |
| the list of specified optimizations to be randomized and applied to the |
| program. This process will repeat until a bug is found or the user |
| kills <tt>bugpoint</tt>. |
| </ol> |
| |
| </div> |
| <!-- *********************************************************************** --> |
| <h2> |
| <a name="notEnough">What to do when bugpoint isn't enough</a> |
| </h2> |
| <!-- *********************************************************************** --> |
| |
| <div> |
| |
| <p>Sometimes, <tt>bugpoint</tt> is not enough. In particular, InstCombine and |
| TargetLowering both have visitor-structured code with lots of potential |
| transformations. If <tt>bugpoint</tt> has still left you with too much code |
| to figure out and the problem seems |
| to be in instcombine, the following steps may help. These same techniques |
| are useful with TargetLowering as well.</p> |
| |
| <p>Turn on <tt>-debug-only=instcombine</tt> and see which transformations |
| within instcombine are firing by selecting out lines with "<tt>IC</tt>" |
| in them.</p> |
| |
| <p>At this point, you have a decision to make. Is the number |
| of transformations small enough to step through them using a debugger? |
| If so, then try that.</p> |
| |
| <p>If there are too many transformations, then a source modification |
| approach may be helpful. |
| In this approach, you can modify the source code of instcombine |
| to disable just those transformations that are being performed on your |
| test input and perform a binary search over the set of transformations. |
| One set of places to modify is the "<tt>visit*</tt>" methods of |
| <tt>InstCombiner</tt> (<i>e.g.</i> <tt>visitICmpInst</tt>), where you can add |
| "<tt>return false</tt>" as the first line of the method.</p> |
| |
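| <p>The binary search mentioned above can be sketched in C++ as follows. This |
| is an illustration under a simplifying assumption: the failure is taken to be |
| monotone, meaning that once enabling the first <i>k</i> transformations |
| triggers the bug, enabling more of them still does. The names |
| <tt>firstFailingPrefix</tt> and <tt>failsWithFirst</tt> are hypothetical; the |
| predicate stands in for rebuilding and re-running your test with only the |
| first <i>k</i> transformations enabled.</p> |
| |
```cpp
#include <cassert>
#include <cstddef>
#include <functional>

// Returns the smallest k in [1, n] for which failsWithFirst(k) is true.
// Assumes failsWithFirst(n) is true and that the predicate is monotone in k;
// interacting transformations can violate that assumption.
std::size_t firstFailingPrefix(
    std::size_t n, const std::function<bool(std::size_t)> &failsWithFirst) {
  assert(n >= 1 && "need at least one transformation to search over");
  std::size_t lo = 1, hi = n; // invariant: failsWithFirst(hi) is true
  while (lo < hi) {
    std::size_t mid = lo + (hi - lo) / 2;
    if (failsWithFirst(mid))
      hi = mid; // the culprit is at or before mid
    else
      lo = mid + 1; // the culprit is after mid
  }
  return lo;
}
```
| |
| <p>Under the monotonicity assumption this needs only about log2(<i>n</i>) |
| rebuild-and-test cycles, rather than one per transformation.</p> |
| |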
| <p>If that still doesn't remove enough, then change the caller of |
| <tt>InstCombiner::DoOneIteration</tt>, <tt>InstCombiner::runOnFunction</tt>, |
| to limit the number of iterations.</p> |
| |
| <p>You may also find it useful to use "<tt>-stats</tt>" now to see what parts |
| of instcombine are firing. This can guide where to put additional reporting |
| code.</p> |
| |
| <p>At this point, if the number of transformations is still too large, then |
| inserting code to limit whether or not to execute the body of the code |
| in the visit function can be helpful. Add a static counter which is |
| incremented on every invocation of the function. Then add code which |
| simply returns false on desired ranges. For example:</p> |
| |
| <div class="doc_code"> |
| <p><tt>static int calledCount = 0;</tt></p> |
| <p><tt>calledCount++;</tt></p> |
| <p><tt>DEBUG(if (calledCount &lt; 212) return false);</tt></p> |
| <p><tt>DEBUG(if (calledCount &gt; 217) return false);</tt></p> |
| <p><tt>DEBUG(if (calledCount == 213) return false);</tt></p> |
| <p><tt>DEBUG(if (calledCount == 214) return false);</tt></p> |
| <p><tt>DEBUG(if (calledCount == 215) return false);</tt></p> |
| <p><tt>DEBUG(if (calledCount == 216) return false);</tt></p> |
| <p><tt>DEBUG(dbgs() &lt;&lt; "visitXOR calledCount: " &lt;&lt; calledCount |
| &lt;&lt; "\n");</tt></p> |
| <p><tt>DEBUG(dbgs() &lt;&lt; "I: "; I-&gt;dump());</tt></p> |
| </div> |
| |
| <p>could be added to <tt>visitXor</tt> to limit it to being |
| applied only to calls 212 and 217. This is from an actual test case and raises |
| an important point: a simple binary search may not be sufficient, as |
| transformations that interact may require isolating more than one call. |
| In TargetLowering, use <tt>return SDNode();</tt> instead of |
| <tt>return false;</tt>.</p> |
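| |
| <p>When more than one call has to stay enabled, listing the allowed call |
| numbers explicitly can be easier to maintain than a chain of comparisons. The |
| following standalone C++ sketch shows the idea; <tt>CallGate</tt> is a |
| hypothetical helper written for this example and is not part of LLVM.</p> |
| |
```cpp
#include <set>
#include <utility>

// Counts invocations and reports whether the current one is on the
// allow-list. Call numbers are 1-based, matching the counter in the
// snippet above (incremented before it is tested).
class CallGate {
public:
  explicit CallGate(std::set<unsigned> enabled)
      : enabled_(std::move(enabled)) {}

  // Call exactly once at the top of the function being gated. Returns true
  // if this invocation should perform the transformation.
  bool shouldRun() { return enabled_.count(++count_) != 0; }

  // Number of invocations seen so far.
  unsigned count() const { return count_; }

private:
  std::set<unsigned> enabled_;
  unsigned count_ = 0;
};
```
| |
| <p>A visit method could then hold a function-local static gate constructed |
| with the call numbers 212 and 217, and return early whenever |
| <tt>shouldRun()</tt> is false.</p> |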
| |
| <p>Now that the number of transformations is down to a manageable |
| number, try examining the output to see if you can figure out which |
| transformations are being done. If that can be figured out, then |
| do the usual debugging. If it isn't obvious which code corresponds to the |
| transformation being performed, set a breakpoint after the call-count-based |
| disabling code and step through the code. Alternatively, you can use |
| "printf" style debugging to report waypoints.</p> |
| |
| </div> |
| |
| <!-- *********************************************************************** --> |
| |
| <hr> |
| <address> |
| <a href="mailto:sabre@nondot.org">Chris Lattner</a><br> |
| <a href="http://llvm.org/">LLVM Compiler Infrastructure</a><br> |
| Last modified: $Date$ |
| </address> |
| |
| </body> |
| </html> |