=pod

=for comment Blame: Brian Gaeke, ff0c766, 2004-07-01 20:29:08 +0000

=head1 NAME

bugpoint - automatic test case reduction tool

=head1 SYNOPSIS

B<bugpoint> [I<options>] [I<input LLVM ll/bc files>] [I<LLVM passes>] B<--args>
I<program arguments> ...

=head1 DESCRIPTION

The B<bugpoint> tool narrows down the source of problems in LLVM tools and passes.
It can be used to debug three types of failures: optimizer crashes,
miscompilations by optimizers, and bad native code generation (including problems
in the static and JIT compilers).  It aims to reduce large test cases to small,
useful ones.  For example, if B<gccas> crashes while optimizing a file,
B<bugpoint> identifies the optimization (or combination of optimizations) that
causes the crash, and reduces the file to a small example that triggers it.

=head2 Design Philosophy

B<bugpoint> is designed to be useful without requiring any hooks into the
LLVM infrastructure.  It works with any and all LLVM passes and code
generators, and does not need to "know" how they work.  Because of this, it may
appear to do stupid things or miss obvious simplifications.  B<bugpoint> is also
designed to spend computer time in order to save programmer time in the
compiler-debugging process; consequently, reducing a test case may take a long
(unattended) time, but we feel it is still worth it.  Note that B<bugpoint> is
generally very quick unless it is debugging a miscompilation where each test of
the program (which requires executing it) takes a long time.

=head2 Automatic Debugger Selection

B<bugpoint> reads each F<.bc> or F<.ll> file specified on the command line and
links them together into a single module, called the test program.  If any LLVM
passes are specified on the command line, it runs these passes on the test
program.  If any of the passes crash, or if they produce malformed output (which
causes the verifier to abort), B<bugpoint> starts the L</Crash debugger>.

Otherwise, if the B<--output> option was not specified, B<bugpoint> runs the test
program with the C backend (which is assumed to generate good code) to generate
a reference output.  Once B<bugpoint> has a reference output for the test
program, it tries executing the program with the selected code generator.  If
the selected code generator crashes, B<bugpoint> starts the L</Crash debugger>
on the code generator.  Otherwise, if the resulting output differs from the
reference output, it assumes the difference resulted from a code generator
failure, and starts the L</Code generator debugger>.

Finally, if the output of the selected code generator matches the reference
output, B<bugpoint> runs the test program after all of the LLVM passes have been
applied to it.  If its output differs from the reference output, it assumes the
difference resulted from a failure in one of the LLVM passes, and enters the
L</Miscompilation debugger>.  Otherwise, there is no problem B<bugpoint> can
debug.

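The selection order described above can be summarized as a small decision
function.  This is only an illustrative Python sketch, not B<bugpoint>'s
implementation: the function name, its parameters, and the returned strings are
assumptions made for this example.

```python
from typing import Optional


def select_debugger(
    passes_crashed: bool,
    codegen_crashed: bool,
    reference_output: str,
    codegen_output: str,
    optimized_output: str,
) -> Optional[str]:
    """Return the name of the debugger to start, or None if nothing is wrong.

    All parameters are hypothetical stand-ins for the results of the runs
    described in the text.
    """
    # A crashing pass (or output rejected by the verifier) is handled first.
    if passes_crashed:
        return "crash"
    # Next, the test program is run with the selected code generator.
    if codegen_crashed:
        return "crash"
    if codegen_output != reference_output:
        return "code generator"
    # Finally, the program is run after the passes have been applied.
    if optimized_output != reference_output:
        return "miscompilation"
    # Every output matches the reference: nothing to debug.
    return None
```

The order of the checks matters: a difference in the optimized output is only
attributed to the passes once the code generator has been shown to reproduce
the reference output.
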
=head2 Crash debugger

If an optimizer or code generator crashes, B<bugpoint> will try as hard as it
can to reduce the list of passes (for optimizer crashes) and the size of the
test program.  First, B<bugpoint> figures out which combination of optimizer
passes triggers the bug.  This is useful when debugging a problem exposed by
B<gccas>, for example, because it runs over 38 passes.

Next, B<bugpoint> tries removing functions from the test program to reduce its
size.  When debugging intraprocedural optimizations, it is usually able to
reduce a test program to a single function.  Once the number of functions has
been reduced, it attempts to delete various edges in the control flow graph to
reduce the size of the function as much as possible.  Finally, B<bugpoint>
deletes any individual LLVM instructions whose absence does not eliminate the
failure.  At the end, B<bugpoint> should tell you which passes crash, give you a
bytecode file, and give you instructions on how to reproduce the failure with
B<opt>, B<analyze>, or B<llc>.

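The idea of deleting anything "whose absence does not eliminate the failure"
can be sketched as a one-at-a-time reduction loop.  This is a simplified
illustration in Python, not the reduction strategy B<bugpoint> actually
implements; C<reduce_list> and the C<still_fails> callback are hypothetical
names, and the crash condition in the usage example is invented.

```python
from typing import Callable, List, Sequence


def reduce_list(
    items: Sequence[str],
    still_fails: Callable[[List[str]], bool],
) -> List[str]:
    """Drop every item whose absence does not eliminate the failure."""
    kept = list(items)
    index = 0
    while index < len(kept):
        # Try the list with the item at `index` left out.
        candidate = kept[:index] + kept[index + 1:]
        if still_fails(candidate):
            # The failure persists, so the item was not needed.
            # `index` now refers to the next item.
            kept = candidate
        else:
            # The item is required to trigger the failure; keep it.
            index += 1
    return kept


if __name__ == "__main__":
    # Hypothetical crash that needs both -licm and -dce to be present.
    def crashes(passes: List[str]) -> bool:
        return "-licm" in passes and "-dce" in passes

    print(reduce_list(["-licm", "-instcombine", "-dce"], crashes))
```

The same loop applies whether the items are passes, functions, or
instructions; only the test that reruns the failing tool changes.
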
=head2 Code generator debugger

The code generator debugger attempts to narrow down the amount of code that is
being miscompiled by the selected code generator.  To do this, it takes the test
program and partitions it into two pieces: one piece that it compiles with the
C backend (into a shared object), and one piece that it runs with either the
JIT or the static compiler (B<llc>).  It uses several techniques to reduce the
amount of code pushed through the LLVM code generator, which narrows the
potential scope of the problem.  When it is finished, it emits two bytecode
files, called "test" (to be compiled with the code generator) and "safe" (to be
compiled with the C backend), along with instructions for reproducing the
problem.  The code generator debugger assumes that the C backend produces good
code.

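The "test"/"safe" split can be pictured with the following Python sketch.  It
moves one function at a time out of the "test" piece as long as the wrong
output persists.  This is an assumption-laden illustration, not the set of
techniques B<bugpoint> uses: C<partition_functions> and the
C<output_differs> callback are hypothetical names.

```python
from typing import Callable, List, Sequence, Tuple


def partition_functions(
    functions: Sequence[str],
    output_differs: Callable[[List[str]], bool],
) -> Tuple[List[str], List[str]]:
    """Split functions into a ("test", "safe") pair.

    `output_differs(test_piece)` is a hypothetical callback that reports
    whether the program still produces wrong output when only `test_piece`
    goes through the code generator under test.
    """
    test = list(functions)
    safe: List[str] = []
    for function in functions:
        trial = [name for name in test if name != function]
        if output_differs(trial):
            # The problem reproduces without this function in the "test"
            # piece, so it can be compiled with the trusted backend.
            test = trial
            safe.append(function)
    return test, safe


if __name__ == "__main__":
    # Hypothetical miscompilation triggered only by the function "foo".
    print(partition_functions(["main", "foo", "bar"], lambda t: "foo" in t))
```
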
=head2 Miscompilation debugger

The miscompilation debugger works similarly to the code generator debugger.  It
works by splitting the test program into two pieces, running the specified
optimizations on one piece, linking the two pieces back together, and then
executing the result.  It attempts to narrow down the list of passes to the one
(or few) that cause the miscompilation, then reduces the portion of the test
program that is being miscompiled.  The miscompilation debugger assumes that
the selected code generator is working properly.

=head2 Advice for using bugpoint

B<bugpoint> can be a remarkably useful tool, but it sometimes works in
non-obvious ways.  Here are some hints and tips:

=over

=item *

In the code generator and miscompilation debuggers, B<bugpoint> only
works with programs that have deterministic output.  Thus, if the program
outputs C<argv[0]>, the date, the time, or any other "random" data, B<bugpoint>
may misinterpret differences in that output as the result of a
miscompilation.  Programs should be temporarily modified to disable outputs that
are likely to vary from run to run.

=item *

In the code generator and miscompilation debuggers, debugging will go faster if
you manually modify the program or its inputs to reduce the runtime while still
exhibiting the problem.

=item *

B<bugpoint> is extremely useful when working on a new optimization: it helps
track down regressions quickly.  To avoid having to relink B<bugpoint> every
time you change your optimization, make B<bugpoint> dynamically load
your optimization by using the B<--load> option.

=item *

B<bugpoint> can generate a lot of output and run for a long period of time.  It
is often useful to capture the output of the program to a file.  For example, in
the C shell, you can type:

    bugpoint ... |& tee bugpoint.log

to get a copy of B<bugpoint>'s output in the file F<bugpoint.log>, as well as on
your terminal.

=item *

B<bugpoint> cannot debug problems with the LLVM linker.  If B<bugpoint> crashes
before you see its C<All input ok> message, you might try running C<llvm-link
-v> on the same set of input files.  If that also crashes, you may be
experiencing a linker bug.

=item *

If your program is supposed to crash, B<bugpoint> will be confused.  One way to
deal with this is to make B<bugpoint> ignore the exit code from your
program by giving it the B<--check-exit-code=false> option.

=back

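Before starting a long reduction, it can help to confirm that the program
really has deterministic output, as the first tip above requires.  The
following Python sketch is one possible check and is not part of B<bugpoint>;
the function name and the default of three runs are assumptions.

```python
import subprocess
import sys
from typing import List


def has_deterministic_output(command: List[str], runs: int = 3) -> bool:
    """Run the command several times and report whether stdout never changes."""
    outputs = set()
    for _ in range(runs):
        # check=False: a non-zero exit status is not treated as an error here.
        result = subprocess.run(command, capture_output=True, check=False)
        outputs.add(result.stdout)
    # Identical output on every run leaves exactly one distinct value.
    return len(outputs) == 1


if __name__ == "__main__":
    # A stand-in program with fixed output; replace with your own command.
    print(has_deterministic_output([sys.executable, "-c", "print('hello')"]))
```

A program that passes this check may still vary under other conditions, so
treat a positive result as a quick sanity check only.
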
=head1 OPTIONS

=over

=item B<--additional-so> F<library>

Load the dynamic shared object F<library> into the test program whenever it is
run.  This is useful if you are debugging programs that depend on non-LLVM
libraries (such as the X or curses libraries) to run.

=item B<--args> I<program args>

Pass all arguments specified after B<--args> to the test program whenever it
runs.  Note that if any of the I<program args> start with a '-', you should use:

    bugpoint [bugpoint args] --args -- [program args]

The "--" right after the B<--args> option tells B<bugpoint> to consider any
options starting with C<-> to be part of the B<--args> option, not options to
B<bugpoint> itself.

=item B<--tool-args> I<tool args>

Pass all arguments specified after B<--tool-args> to the LLVM tool under test
(B<llc>, B<lli>, etc.) whenever it runs.  You should use this option in the
following way:

    bugpoint [bugpoint args] --tool-args -- [tool args]

The "--" right after the B<--tool-args> option tells B<bugpoint> to consider any
options starting with C<-> to be part of the B<--tool-args> option, not
options to B<bugpoint> itself.  (See B<--args>, above.)

=item B<--check-exit-code>=I<{true,false}>

Assume a non-zero exit code or core dump from the test program is a failure.
Defaults to true.

=item B<--disable-{dce,simplifycfg}>

Do not run the specified passes to clean up and reduce the size of the test
program.  By default, B<bugpoint> uses these passes internally when attempting
to reduce test programs.  If you're trying to find a bug in one of these passes,
B<bugpoint> may crash.

=item B<--help>

Print a summary of command line options.

=item B<--input> F<filename>

Open F<filename> and redirect the standard input of the test program, whenever
it runs, to come from that file.

=item B<--load> F<plugin>

Load the dynamic object F<plugin> into B<bugpoint> itself.  This object should
register new optimization passes.  Once loaded, the object will add new command
line options to enable various optimizations.  To see the new complete list of
optimizations, use the B<--help> and B<--load> options together; for example:

    bugpoint --load myNewPass.so --help

=item B<--output> F<filename>

Whenever the test program produces output on its standard output stream, it
should match the contents of F<filename> (the "reference output").  If you
do not use this option, B<bugpoint> will attempt to generate a reference output
by compiling the program with the C backend and running it.

=item B<--profile-info-file> F<filename>

Name of the profile file loaded by B<--profile-loader>.

=item B<--run-{int,jit,llc,cbe}>

Whenever the test program is compiled, B<bugpoint> should generate code for it
using the specified code generator.  These options allow you to choose the
interpreter, the JIT compiler, the static native code compiler, or the C
backend, respectively.

=back

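When B<bugpoint> is driven from a script, the C<--args --> convention described
under B<--args> is easy to get wrong.  The Python sketch below assembles an
argument list in the order shown in the SYNOPSIS.  The helper name
C<bugpoint_command> and the file names in the usage example are hypothetical;
only options documented above are used.

```python
from typing import List, Sequence


def bugpoint_command(
    inputs: Sequence[str],
    passes: Sequence[str],
    program_args: Sequence[str],
) -> List[str]:
    """Build a bugpoint argument list: inputs, passes, then program arguments."""
    command = ["bugpoint", *inputs, *passes]
    if program_args:
        # "--" makes bugpoint treat dash-prefixed words as program arguments
        # rather than as its own options.
        command += ["--args", "--", *program_args]
    return command


if __name__ == "__main__":
    # Hypothetical input file and program arguments.
    print(bugpoint_command(["test.bc"], ["-licm"], ["-v", "input.txt"]))
```

Passing the result as a list to a process-spawning call, rather than joining it
into a single string, avoids shell quoting problems with the program arguments.
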
=head1 EXIT STATUS

If B<bugpoint> succeeds in finding a problem, it will exit with 0.  Otherwise,
if an error occurs, it will exit with a non-zero value.

=head1 SEE ALSO

L<opt>, L<analyze>

=head1 AUTHOR

Maintained by the LLVM Team (L<http://llvm.cs.uiuc.edu>).

=cut