=pod

=head1 NAME

bugpoint - automatic test case reduction tool

=head1 SYNOPSIS

bugpoint [options] [input LLVM ll/bc files] [LLVM passes] --args
I<program arguments> ...

=head1 DESCRIPTION

The B<bugpoint> tool narrows down the source of problems in LLVM tools and passes.
It can be used to debug three types of failures: optimizer crashes,
miscompilations by optimizers, and bad native code generation (including problems
in the static and JIT compilers). It aims to reduce large test cases to small,
useful ones. For example, if B<gccas> crashes while optimizing a file, B<bugpoint>
will identify the optimization (or combination of optimizations) that causes the
crash, and reduce the file down to a small example which triggers the crash.

=head2 Design Philosophy

B<bugpoint> is designed to be a useful tool without requiring any hooks into the
LLVM infrastructure at all. It works with any and all LLVM passes and code
generators, and does not need to "know" how they work. Because of this, it may
appear to do stupid things or miss obvious simplifications. B<bugpoint> is also
designed to spend computer time in order to save programmer time in the
compiler-debugging process; consequently, it may take a long period of
(unattended) time to reduce a test case, but we feel it is still worth it. Note
that B<bugpoint> is generally very quick unless debugging a miscompilation where
each test of the program (which requires executing it) takes a long time.

=head2 Automatic Debugger Selection

B<bugpoint> reads each F<.bc> or F<.ll> file specified on the command line and
links them together into a single module, called the test program. If any LLVM
passes are specified on the command line, it runs these passes on the test
program. If any of the passes crash, or if they produce malformed output (which
causes the verifier to abort), B<bugpoint> starts the L</Crash debugger>.

Otherwise, if the B<--output> option was not specified, B<bugpoint> runs the test
program with the C backend (which is assumed to generate good code) to generate
a reference output. Once B<bugpoint> has a reference output for the test
program, it tries executing it with the selected code generator. If the
selected code generator crashes, B<bugpoint> starts the L</Crash debugger> on
the code generator. Otherwise, if the resulting output differs from the
reference output, it assumes the difference resulted from a code generator
failure, and starts the L</Code generator debugger>.

Finally, if the output of the selected code generator matches the reference
output, B<bugpoint> runs the test program after all of the LLVM passes have been
applied to it. If its output differs from the reference output, it assumes the
difference resulted from a failure in one of the LLVM passes, and enters the
L</Miscompilation debugger>. Otherwise, there is no problem B<bugpoint> can debug.

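The selection order described above can be summarized as a small decision
function. This is only an illustrative sketch in Python, not code taken from
B<bugpoint>: the C<Debugger> enum, the C<select_debugger> function, and its
parameters are all hypothetical names that stand in for the results of running
the passes and the code generator.

```python
from enum import Enum
from typing import Optional


class Debugger(Enum):
    """Hypothetical labels for the three debuggers described in the text."""
    CRASH = "crash debugger"
    CODEGEN = "code generator debugger"
    MISCOMPILATION = "miscompilation debugger"


def select_debugger(
    passes_crashed: bool,
    codegen_crashed: bool,
    reference_output: str,
    codegen_output: str,
    optimized_output: str,
) -> Optional[Debugger]:
    """Return the debugger the prose would choose, or None if nothing is wrong."""
    # A crashing pass (or malformed output caught by the verifier) comes first.
    if passes_crashed:
        return Debugger.CRASH
    # Next, the selected code generator runs the unoptimized test program.
    if codegen_crashed:
        return Debugger.CRASH
    if codegen_output != reference_output:
        return Debugger.CODEGEN
    # Finally, the test program is run after the passes have been applied.
    if optimized_output != reference_output:
        return Debugger.MISCOMPILATION
    return None
```

The ordering matters: a code generator mismatch is checked before the
optimized program is run, which is why the miscompilation debugger can assume
the selected code generator is working.
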
=head2 Crash debugger

If an optimizer or code generator crashes, B<bugpoint> will try as hard as it
can to reduce the list of passes (for optimizer crashes) and the size of the
test program. First, B<bugpoint> figures out which combination of optimizer
passes triggers the bug. This is useful when debugging a problem exposed by
B<gccas>, for example, because it runs over 38 passes.

Next, B<bugpoint> tries removing functions from the test program, to reduce its
size. When debugging intraprocedural optimizations, it can usually reduce the
test program to a single function. Once the number of functions has been
reduced, it attempts to delete various edges in the control flow graph, to
reduce the size of the function as much as possible. Finally, B<bugpoint>
deletes any individual LLVM instructions whose absence does not eliminate the
failure. At the end, B<bugpoint> should tell you what passes crash, give you a
bytecode file, and give you instructions on how to reproduce the failure with
B<opt>, B<analyze>, or B<llc>.

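The general idea of shrinking a pass list (or a list of functions) while the
failure persists can be sketched as a greedy list reducer. This is a generic
illustration in Python under stated assumptions, not B<bugpoint>'s actual
algorithm: C<reduce_list> is a hypothetical helper, and C<still_fails> is a
caller-supplied predicate standing in for "run the tool on this subset and
check whether it still crashes".

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")


def reduce_list(items: Sequence[T], still_fails: Callable[[List[T]], bool]) -> List[T]:
    """Greedily drop chunks of `items` for as long as `still_fails` stays true."""
    current = list(items)
    chunk = max(len(current) // 2, 1)
    while chunk >= 1:
        index = 0
        while index < len(current):
            # Try the list with one chunk removed.
            candidate = current[:index] + current[index + chunk:]
            if candidate and still_fails(candidate):
                current = candidate   # the chunk was not needed; leave it out
            else:
                index += chunk        # the chunk matters; step past it
        chunk //= 2                   # retry with smaller chunks
    return current


# Hypothetical example: the crash needs both "pass-c" and "pass-e".
passes = ["pass-a", "pass-b", "pass-c", "pass-d", "pass-e", "pass-f"]
needed = reduce_list(passes, lambda ps: "pass-c" in ps and "pass-e" in ps)
```

Each call to the predicate corresponds to one run of the tool under test, so
the cost of a reduction is dominated by how long a single run takes.
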
=head2 Code generator debugger

The code generator debugger attempts to narrow down the amount of code that is
being miscompiled by the selected code generator. To do this, it takes the test
program and partitions it into two pieces: one piece which it compiles with the
C backend (into a shared object), and one piece which it runs with either the
JIT or the static compiler (B<llc>). It uses several techniques to reduce the
amount of code pushed through the LLVM code generator, to reduce the potential
scope of the problem. After it is finished, it emits two bytecode files, called
"test" (to be compiled with the code generator) and "safe" (to be compiled with
the C backend), and instructions for reproducing the problem.
The code generator debugger assumes that the C backend produces good code.

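The "test" and "safe" split can be pictured as moving functions out of the
piece that goes through the suspect code generator for as long as the output
still mismatches. The Python sketch below is a simplified assumption about the
shape of that search, not the techniques B<bugpoint> actually uses;
C<partition_functions> and the C<miscompiled> predicate are hypothetical.

```python
from typing import Callable, FrozenSet, Iterable, Set, Tuple


def partition_functions(
    functions: Iterable[str],
    miscompiled: Callable[[FrozenSet[str]], bool],
) -> Tuple[Set[str], Set[str]]:
    """Split `functions` into (test, safe) pieces.

    `miscompiled(piece)` should report whether the program still produces
    wrong output when only `piece` goes through the code generator under test.
    """
    test = set(functions)          # start with everything in the "test" piece
    safe: Set[str] = set()
    for name in sorted(test):      # sorted() copies, so mutating `test` is fine
        trial = frozenset(test - {name})
        if miscompiled(trial):
            # The failure survives without this function: move it to "safe".
            test.discard(name)
            safe.add(name)
    return test, safe


# Hypothetical example: only "hash" is miscompiled.
test_piece, safe_piece = partition_functions(
    ["main", "helper", "hash"], lambda piece: "hash" in piece
)
```
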
=head2 Miscompilation debugger

The miscompilation debugger works similarly to the code generator debugger. It
works by splitting the test program into two pieces, running the optimizations
specified on one piece, linking the two pieces back together, and then executing
the result. It attempts to narrow down the list of passes to the one (or few)
which are causing the miscompilation, then reduce the portion of the test
program which is being miscompiled. The miscompilation debugger assumes that
the selected code generator is working properly.

=head2 Advice for using bugpoint

B<bugpoint> can be a remarkably useful tool, but it sometimes works in
non-obvious ways. Here are some hints and tips:

=over

=item *

In the code generator and miscompilation debuggers, B<bugpoint> only
works with programs that have deterministic output. Thus, if the program
outputs C<argv[0]>, the date, time, or any other "random" data, B<bugpoint> may
misinterpret differences in this output as the result of a
miscompilation. Programs should be temporarily modified to disable outputs that
are likely to vary from run to run.

=item *

In the code generator and miscompilation debuggers, debugging will go faster if
you manually modify the program or its inputs so that it runs for less time but
still exhibits the problem.

=item *

B<bugpoint> is extremely useful when working on a new optimization: it helps
track down regressions quickly. To avoid having to relink B<bugpoint> every
time you change your optimization, make B<bugpoint> dynamically load
your optimization by using the B<--load> option.

=item *

B<bugpoint> can generate a lot of output and run for a long period of time. It
is often useful to capture the output of the program to a file. For example, in
the C shell, you can type:

    bugpoint ... |& tee bugpoint.log

to get a copy of B<bugpoint>'s output in the file F<bugpoint.log>, as well as on
your terminal.

=item *

B<bugpoint> cannot debug problems with the LLVM linker. If B<bugpoint> crashes
before you see its C<All input ok> message, you might try running C<llvm-link
-v> on the same set of input files. If that also crashes, you may be
experiencing a linker bug.

=item *

If your program is supposed to crash, B<bugpoint> will be confused. One way to
deal with this is to cause B<bugpoint> to ignore the exit code from your
program, by giving it the B<--check-exit-code=false> option.

=back

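When B<bugpoint> is driven from a script rather than from the C shell, the
C<|& tee> tip in the list above can be approximated with Python's standard
C<subprocess> module. This is a minimal sketch, assuming a hypothetical helper
named C<run_and_log>; the command passed to it is whatever you would have typed
at the prompt.

```python
import subprocess
import sys
from typing import List


def run_and_log(command: List[str], log_path: str) -> int:
    """Run `command`, copying its stdout and stderr to the terminal and a log file."""
    with open(log_path, "w", encoding="utf-8") as log:
        process = subprocess.Popen(
            command,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # merge stderr into stdout, like `|&`
            text=True,
        )
        assert process.stdout is not None
        for line in process.stdout:
            sys.stdout.write(line)     # copy to the terminal
            log.write(line)            # and to the log file
        return process.wait()          # exit status of the command
```

The return value is the exit status of the command, so the caller can still
distinguish a successful reduction from an error.
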
=head1 OPTIONS

=over

=item B<--additional-so> F<library>

Load the dynamic shared object F<library> into the test program whenever it is
run. This is useful if you are debugging programs which depend on non-LLVM
libraries (such as the X or curses libraries) to run.

=item B<--args> I<program args>

Pass all arguments specified after B<--args> to the test program whenever it runs.
Note that if any of the I<program args> start with a '-', you should use:

    bugpoint [bugpoint args] --args -- [program args]

The "--" right after the B<--args> option tells B<bugpoint> to consider any
options starting with C<-> to be part of the B<--args> option, not as options to
B<bugpoint> itself.

=item B<--tool-args> I<tool args>

Pass all arguments specified after B<--tool-args> to the LLVM tool under test
(B<llc>, B<lli>, etc.) whenever it runs. You should use this option in the
following way:

    bugpoint [bugpoint args] --tool-args -- [tool args]

The "--" right after the B<--tool-args> option tells B<bugpoint> to consider any
options starting with C<-> to be part of the B<--tool-args> option, not as
options to B<bugpoint> itself. (See B<--args>, above.)

=item B<--check-exit-code>=I<{true,false}>

Assume a non-zero exit code or core dump from the test program is a failure.
Defaults to true.

=item B<--disable-{dce,simplifycfg}>

Do not run the specified passes to clean up and reduce the size of the test
program. By default, B<bugpoint> uses these passes internally when attempting to
reduce test programs. If you're trying to find a bug in one of these passes,
B<bugpoint> may crash.

=item B<--help>

Print a summary of command line options.

=item B<--input> F<filename>

Open F<filename> and redirect the standard input of the test program, whenever
it runs, to come from that file.

=item B<--load> F<plugin>

Load the dynamic object F<plugin> into B<bugpoint> itself. This object should
register new optimization passes. Once loaded, the object will add new command
line options to enable various optimizations. To see the new complete list of
optimizations, use the B<--help> and B<--load> options together; for example:

    bugpoint --load myNewPass.so --help

=item B<--output> F<filename>

Whenever the test program produces output on its standard output stream, it
should match the contents of F<filename> (the "reference output"). If you
do not use this option, B<bugpoint> will attempt to generate a reference output
by compiling the program with the C backend and running it.

=item B<--profile-info-file> F<filename>

Profile file loaded by B<--profile-loader>.

=item B<--run-{int,jit,llc,cbe}>

Whenever the test program is compiled, B<bugpoint> should generate code for it
using the specified code generator. These options allow you to choose the
interpreter, the JIT compiler, the static native code compiler, or the C
backend, respectively.

=back

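To make the argument ordering from the SYNOPSIS and the C<--> convention for
B<--args> concrete, here is a hedged Python sketch that assembles a command
line as a list of strings. The helper C<build_bugpoint_command> and the file
names in the example are hypothetical; only B<--output>, B<--input>, and
B<--args> as described above are used, and the function merely builds the
list without running anything.

```python
from typing import List, Optional, Sequence


def build_bugpoint_command(
    inputs: Sequence[str],
    passes: Sequence[str] = (),
    program_args: Sequence[str] = (),
    reference_output: Optional[str] = None,
    stdin_file: Optional[str] = None,
) -> List[str]:
    """Assemble a bugpoint command line: options, inputs, passes, then --args."""
    command = ["bugpoint"]
    if reference_output is not None:
        command += ["--output", reference_output]
    if stdin_file is not None:
        command += ["--input", stdin_file]
    command += list(inputs)
    command += list(passes)
    if program_args:
        # The "--" keeps arguments that start with "-" away from bugpoint itself.
        command += ["--args", "--", *program_args]
    return command


# Hypothetical file names and program arguments, for illustration only.
command = build_bugpoint_command(
    ["test.bc"], ["-licm"], ["-n", "10"], reference_output="expected.txt"
)
```

Always emitting C<--> after B<--args> is a deliberate simplification: it is
harmless when no program argument starts with a dash and necessary when one
does.
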
=head1 EXIT STATUS

If B<bugpoint> succeeds in finding a problem, it will exit with 0. Otherwise,
if an error occurs, it will exit with a non-zero value.

=head1 SEE ALSO

L<opt>, L<analyze>

=head1 AUTHOR

Maintained by the LLVM Team (L<http://llvm.cs.uiuc.edu>).

=cut