================================
Fuzzing LLVM libraries and tools
================================

.. contents::
   :local:
   :depth: 2

Introduction
============

The LLVM tree includes a number of fuzzers for various components. These are
built on top of :doc:`LibFuzzer <LibFuzzer>`.


Available Fuzzers
=================

clang-fuzzer
------------

A |generic fuzzer| that tries to compile textual input as C++ code. Some of the
bugs this fuzzer has reported are `on bugzilla`__ and `on OSS Fuzz's
tracker`__.

__ https://llvm.org/pr23057
__ https://bugs.chromium.org/p/oss-fuzz/issues/list?q=proj-llvm+clang-fuzzer

clang-proto-fuzzer
------------------

A |protobuf fuzzer| that compiles valid C++ programs generated from a protobuf
class that describes a subset of the C++ language.

This fuzzer accepts clang command line options after
``-ignore_remaining_args=1``. For example, the following command will fuzz
clang with a higher optimization level:

.. code-block:: shell

   % bin/clang-proto-fuzzer <corpus-dir> -ignore_remaining_args=1 -O3

clang-format-fuzzer
-------------------

A |generic fuzzer| that runs clang-format_ on C++ text fragments. Some of the
bugs this fuzzer has reported are `on bugzilla`__ and `on OSS Fuzz's
tracker`__.

.. _clang-format: https://clang.llvm.org/docs/ClangFormat.html

__ https://llvm.org/pr23052
__ https://bugs.chromium.org/p/oss-fuzz/issues/list?q=proj-llvm+clang-format-fuzzer

llvm-as-fuzzer
--------------

A |generic fuzzer| that tries to parse text as :doc:`LLVM assembly <LangRef>`.
Some of the bugs this fuzzer has reported are `on bugzilla`__.

__ https://llvm.org/pr24639

llvm-dwarfdump-fuzzer
---------------------

A |generic fuzzer| that interprets inputs as object files and runs
:doc:`llvm-dwarfdump <CommandGuide/llvm-dwarfdump>` on them. Some of the bugs
this fuzzer has reported are `on OSS Fuzz's tracker`__.

__ https://bugs.chromium.org/p/oss-fuzz/issues/list?q=proj-llvm+llvm-dwarfdump-fuzzer

llvm-isel-fuzzer
----------------

A |LLVM IR fuzzer| aimed at finding bugs in instruction selection.

This fuzzer accepts flags after ``-ignore_remaining_args=1``. The flags match
those of :doc:`llc <CommandGuide/llc>` and the triple is required. For example,
the following command would fuzz AArch64 with :doc:`GlobalISel`:

.. code-block:: shell

   % bin/llvm-isel-fuzzer <corpus-dir> -ignore_remaining_args=1 -mtriple aarch64 -global-isel -O0

Some flags can also be specified in the binary name itself in order to support
OSS Fuzz, which has trouble with required arguments. To do this, you can copy
or move ``llvm-isel-fuzzer`` to ``llvm-isel-fuzzer=x-y-z``, where x, y, and z
are architecture names (``aarch64``, ``x86_64``), optimization levels (``O0``,
``O2``), or specific keywords like ``gisel`` for enabling global instruction
selection.
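
As a rough illustration of this naming scheme, the sketch below splits such a
binary name into its option tokens. This is not the code LLVM uses to interpret
the name; ``splitNameOptions`` is a hypothetical helper, and mapping each token
to an actual flag is left out.

.. code-block:: c++

   #include <cstddef>
   #include <string>
   #include <vector>

   // Hypothetical helper (not part of LLVM): returns the dash-separated tokens
   // that follow the first '=' in an executable name, or an empty list if the
   // name contains no '='.
   std::vector<std::string> splitNameOptions(const std::string &ExecName) {
     std::vector<std::string> Opts;
     std::size_t Eq = ExecName.find('=');
     if (Eq == std::string::npos)
       return Opts;

     std::size_t Pos = Eq + 1;
     while (Pos <= ExecName.size()) {
       std::size_t Dash = ExecName.find('-', Pos);
       if (Dash == std::string::npos)
         Dash = ExecName.size();
       if (Dash > Pos) // Skip empty tokens such as those produced by "--".
         Opts.push_back(ExecName.substr(Pos, Dash - Pos));
       Pos = Dash + 1;
     }
     return Opts;
   }

For a binary named ``llvm-isel-fuzzer=aarch64-O0-gisel`` this helper yields the
tokens ``aarch64``, ``O0``, and ``gisel``.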

llvm-mc-assemble-fuzzer
-----------------------

A |generic fuzzer| that fuzzes the MC layer's assemblers by treating inputs as
target-specific assembly.

Note that this fuzzer has an unusual command line interface which is not fully
compatible with all of libFuzzer's features. Fuzzer arguments must be passed
after ``--fuzzer-args``, and any ``llc`` flags must use two dashes. For
example, to fuzz the AArch64 assembler you might use the following command:

.. code-block:: console

   llvm-mc-assemble-fuzzer --triple=aarch64-linux-gnu --fuzzer-args -max_len=4

This scheme will likely change in the future.

llvm-mc-disassemble-fuzzer
--------------------------

A |generic fuzzer| that fuzzes the MC layer's disassemblers by treating inputs
as assembled binary data.

Note that this fuzzer has an unusual command line interface which is not fully
compatible with all of libFuzzer's features. See the notes above about
``llvm-mc-assemble-fuzzer`` for details.


.. |generic fuzzer| replace:: :ref:`generic fuzzer <fuzzing-llvm-generic>`
.. |protobuf fuzzer|
   replace:: :ref:`libprotobuf-mutator based fuzzer <fuzzing-llvm-protobuf>`
.. |LLVM IR fuzzer|
   replace:: :ref:`structured LLVM IR fuzzer <fuzzing-llvm-ir>`


Mutators and Input Generators
=============================

The inputs for a fuzz target are generated via random mutations of a
:ref:`corpus <libfuzzer-corpus>`. There are a few options for the kinds of
mutations that a fuzzer in LLVM might want.

.. _fuzzing-llvm-generic:

Generic Random Fuzzing
----------------------

The most basic form of input mutation is to use the built-in mutators of
libFuzzer. These simply treat the input corpus as a bag of bits and make random
mutations. This type of fuzzer is good for stressing the surface layers of a
program, such as lexers, parsers, or binary protocols.

Some of the in-tree fuzzers that use this type of mutator are `clang-fuzzer`_,
`clang-format-fuzzer`_, `llvm-as-fuzzer`_, `llvm-dwarfdump-fuzzer`_,
`llvm-mc-assemble-fuzzer`_, and `llvm-mc-disassemble-fuzzer`_.
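
To make the shape of a generic fuzz target concrete, here is a minimal sketch.
``LLVMFuzzerTestOneInput`` is libFuzzer's entry point; the code under test,
``parensBalanced``, is a hypothetical stand-in for a real lexer or parser and
does not correspond to any in-tree fuzzer.

.. code-block:: c++

   #include <cstddef>
   #include <cstdint>
   #include <string>

   // Hypothetical code under test: reports whether the parentheses in Text
   // are balanced. A real fuzzer would call into a lexer or parser instead.
   static bool parensBalanced(const std::string &Text) {
     int Depth = 0;
     for (char C : Text) {
       if (C == '(')
         ++Depth;
       else if (C == ')' && --Depth < 0)
         return false; // A ')' appeared before its matching '('.
     }
     return Depth == 0;
   }

   // libFuzzer calls this once per generated input. The bytes are arbitrary,
   // so the target must tolerate any content, including an empty buffer.
   extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
     std::string Text(reinterpret_cast<const char *>(Data), Size);
     (void)parensBalanced(Text); // Only crashes and sanitizer reports matter.
     return 0;
   }

The target ignores the result of the code under test: with this style of
fuzzing, bugs surface as crashes, hangs, or sanitizer reports rather than as
wrong return values.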

.. _fuzzing-llvm-protobuf:

Structured Fuzzing using ``libprotobuf-mutator``
------------------------------------------------

We can use libprotobuf-mutator_ in order to perform structured fuzzing and
stress deeper layers of programs. This works by defining a protobuf class that
translates arbitrary data into structurally interesting input. Specifically, we
use this to work with a subset of the C++ language and perform mutations that
produce valid C++ programs in order to exercise parts of clang that are more
interesting than parser error handling.

To build this kind of fuzzer you need `protobuf`_ and its dependencies
installed, and you need to specify some extra flags when configuring the build
with :doc:`CMake <CMake>`. For example, `clang-proto-fuzzer`_ can be enabled by
adding ``-DCLANG_ENABLE_PROTO_FUZZER=ON`` to the flags described in
:ref:`building-fuzzers`.

The only in-tree fuzzer that uses ``libprotobuf-mutator`` today is
`clang-proto-fuzzer`_.

.. _libprotobuf-mutator: https://github.com/google/libprotobuf-mutator
.. _protobuf: https://github.com/google/protobuf

.. _fuzzing-llvm-ir:

Structured Fuzzing of LLVM IR
-----------------------------

We also use a more direct form of structured fuzzing for fuzzers that take
:doc:`LLVM IR <LangRef>` as input. This is achieved through the ``FuzzMutate``
library, which was `discussed at EuroLLVM 2017`_.

The ``FuzzMutate`` library is used to structurally fuzz backends in
`llvm-isel-fuzzer`_.

.. _discussed at EuroLLVM 2017: https://www.youtube.com/watch?v=UBbQ_s6hNgg


Building and Running
====================

.. _building-fuzzers:

Configuring LLVM to Build Fuzzers
---------------------------------

Fuzzers will be built and linked to libFuzzer by default as long as you build
LLVM with sanitizer coverage enabled. You would typically also enable at least
one sanitizer to make the fuzzers more likely to detect bugs, so the most
common way to build the fuzzers is by adding the following two flags to your
CMake invocation: ``-DLLVM_USE_SANITIZER=Address -DLLVM_USE_SANITIZE_COVERAGE=On``.

.. note:: If you have ``compiler-rt`` checked out in an LLVM tree when building
          with sanitizers, you'll want to specify ``-DLLVM_BUILD_RUNTIME=Off``
          to avoid building the sanitizers themselves with sanitizers enabled.

Continuously Running and Finding Bugs
-------------------------------------

There used to be a public buildbot running LLVM fuzzers continuously, and while
this did find issues, it had no good way to report problems in an actionable
form. Because of this, we're moving towards using `OSS Fuzz`_ more instead.

You can browse the `LLVM project issue list`_ for the bugs found by
`LLVM on OSS Fuzz`_. These are also mailed to the `llvm-bugs mailing
list`_.

.. _OSS Fuzz: https://github.com/google/oss-fuzz
.. _LLVM project issue list:
   https://bugs.chromium.org/p/oss-fuzz/issues/list?q=Proj-llvm
.. _LLVM on OSS Fuzz:
   https://github.com/google/oss-fuzz/blob/master/projects/llvm
.. _llvm-bugs mailing list:
   http://lists.llvm.org/cgi-bin/mailman/listinfo/llvm-bugs


Utilities for Writing Fuzzers
=============================

There are some utilities available for writing fuzzers in LLVM.

Some helpers for handling the command line interface are available in
``include/llvm/FuzzMutate/FuzzerCLI.h``, including functions to parse command
line options in a consistent way and to implement standalone main functions so
your fuzzer can be built and tested when not built against libFuzzer.
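
The idea behind a standalone main can be sketched as follows. This is not the
code in ``FuzzerCLI.h``; ``runFuzzTargetOnFiles`` is a hypothetical helper
that replays input files through the fuzz target without libFuzzer, and the
trivial ``LLVMFuzzerTestOneInput`` is only there to keep the sketch
self-contained.

.. code-block:: c++

   #include <cstddef>
   #include <cstdint>
   #include <fstream>
   #include <iterator>
   #include <string>
   #include <vector>

   // Trivial stand-in fuzz target; a real fuzzer defines its own.
   extern "C" int LLVMFuzzerTestOneInput(const uint8_t *Data, size_t Size) {
     (void)Data;
     (void)Size;
     return 0;
   }

   // Hypothetical helper: runs the fuzz target once on the contents of each
   // readable file and returns how many files were processed. Files that
   // cannot be opened are skipped.
   std::size_t runFuzzTargetOnFiles(const std::vector<std::string> &Paths) {
     std::size_t NumRun = 0;
     for (const std::string &Path : Paths) {
       std::ifstream In(Path, std::ios::binary);
       if (!In)
         continue;
       std::vector<char> Bytes((std::istreambuf_iterator<char>(In)),
                               std::istreambuf_iterator<char>());
       LLVMFuzzerTestOneInput(reinterpret_cast<const uint8_t *>(Bytes.data()),
                              Bytes.size());
       ++NumRun;
     }
     return NumRun;
   }

A standalone ``main`` built on this idea would forward the file names from its
command line to such a helper, which makes it possible to replay a crashing
input under a debugger without linking against libFuzzer.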

There is also some handling of the CMake config for fuzzers, where you should
use the ``add_llvm_fuzzer`` function to set up fuzzer targets. This function
works similarly to functions such as ``add_llvm_tool``, but it takes care of
linking to libFuzzer when appropriate and can be passed the ``DUMMY_MAIN``
argument to enable standalone testing.
240enable standalone testing.