<img src="https://www.code-intelligence.com/hubfs/Logos/CI%20Logos/Jazzer_einfach.png" height=150px alt="Jazzer logo">


# Jazzer
[![Maven Central](https://img.shields.io/maven-central/v/com.code-intelligence/jazzer-api)](https://search.maven.org/search?q=g:com.code-intelligence%20a:jazzer-api)
![GitHub Actions](https://github.com/CodeIntelligenceTesting/jazzer/workflows/Build%20all%20targets%20and%20run%20all%20tests/badge.svg)
[![Fuzzing Status](https://oss-fuzz-build-logs.storage.googleapis.com/badges/java-example.svg)](https://bugs.chromium.org/p/oss-fuzz/issues/list?sort=-opened&can=1&q=proj:java-example)

Jazzer is a coverage-guided, in-process fuzzer for the JVM platform developed by [Code Intelligence](https://code-intelligence.com).
It is based on [libFuzzer](https://llvm.org/docs/LibFuzzer.html) and brings many of its instrumentation-powered mutation features to the JVM.

The JVM bytecode is executed inside the fuzzer process, which ensures fast execution speeds and allows seamless fuzzing of
native libraries.

Jazzer supports Linux and (experimentally) macOS 10.15 and 11 as well as Windows, all on the x64 architecture.

## News: Jazzer available in OSS-Fuzz

[Code Intelligence](https://code-intelligence.com) and Google have teamed up to bring support for Java, Kotlin, and other JVM-based languages to [OSS-Fuzz](https://github.com/google/oss-fuzz), Google's project for large-scale fuzzing of open-source software. Read [the blogpost](https://security.googleblog.com/2021/03/fuzzing-java-in-oss-fuzz.html) over at the Google Security Blog.

If you want to learn more about Jazzer and OSS-Fuzz, [watch the FuzzCon 2020 talk](https://www.youtube.com/watch?v=SmH3Ys_k8vA&list=PLI0R_0_8-TV55gJU-UXrOzZoPbVOj1CW6&index=3) by [Abhishek Arya](https://twitter.com/infernosec) and [Fabian Meumertzheim](https://twitter.com/fhenneke).

## Installation

The preferred way to install Jazzer is to compile it from source using [Bazel](https://bazel.build), but binary distributions for x64 Linux as well as a Docker image are also available.
Note that these binaries might be outdated as Jazzer follows the "Live at Head" philosophy: you should be able to check out the latest commit from `main` and build it.

Support for Jazzer has recently been added to [rules_fuzzing](https://github.com/bazelbuild/rules_fuzzing), the official Bazel rules for fuzzing. See their README for instructions on how to use Jazzer in a Java Bazel project.

### Using Docker

The "distroless" Docker image [cifuzz/jazzer](https://hub.docker.com/r/cifuzz/jazzer) includes Jazzer together with OpenJDK 11. Just mount a directory containing your compiled fuzz target into the container under `/fuzzing` by running:

```sh
docker run -v path/containing/the/application:/fuzzing cifuzz/jazzer <arguments>
```

If Jazzer produces a finding, the input that triggered it will be available in the same directory.

### Using Bazel

Jazzer has the following dependencies when being built from source:

* JDK 8 or later (e.g. [OpenJDK](https://openjdk.java.net/))
* [Clang](https://clang.llvm.org/) 9.0 or later (using a recent version is strongly recommended)

#### Linux

Jazzer uses [Bazelisk](https://github.com/bazelbuild/bazelisk) to automatically download and install Bazel on Linux.
Building Jazzer from source and running it thus only requires the following assuming the dependencies are installed:

```bash
git clone https://github.com/CodeIntelligenceTesting/jazzer
cd jazzer
# Note the double dash used to pass <arguments> to Jazzer rather than Bazel.
./bazelisk-linux-amd64 run //:jazzer -- <arguments>
```

If you prefer to build binaries that can be run without Bazel, use the following command to build your own archive with release binaries:

```bash
$ ./bazelisk-linux-amd64 build //:jazzer_release
...
INFO: Found 1 target...
Target //:jazzer_release up-to-date:
  bazel-bin/jazzer_release.tar.gz
...
```

This will print the path of a `jazzer_release.tar.gz` archive that contains the same binaries that would be part of a release.

#### macOS

Since Jazzer does not ship the macOS version of [Bazelisk](https://github.com/bazelbuild/bazelisk), a tool that automatically downloads and installs the correct version of Bazel, download [the most recent release](https://github.com/bazelbuild/bazelisk/releases) of `bazelisk-darwin`.
Afterwards, clone Jazzer and run it via:

```bash
git clone https://github.com/CodeIntelligenceTesting/jazzer
cd jazzer
# Note the double dash used to pass <arguments> to Jazzer rather than Bazel.
/path/to/bazelisk-darwin run //:jazzer -- <arguments>
```

If you prefer to build binaries that can be run without Bazel, use the following command to build your own archive with release binaries:

```bash
$ /path/to/bazelisk-darwin build //:jazzer_release
...
INFO: Found 1 target...
Target //:jazzer_release up-to-date:
  bazel-bin/jazzer_release.tar.gz
...
```

This will print the path of a `jazzer_release.tar.gz` archive that contains the same binaries that would be part of a release.

The build may fail with the clang shipped with Xcode. If you encounter issues during the build, add `--config=toolchain`
right after `run` or `build` in the `bazelisk` commands above to use a checked-in toolchain that is known to work.

### Using the provided binaries

Binary releases are available under [Releases](https://github.com/CodeIntelligenceTesting/jazzer/releases) and are built
using an [LLVM 11 Bazel toolchain](https://github.com/CodeIntelligenceTesting/llvm-toolchain).

The binary distributions of Jazzer consist of the following components:

- `jazzer_driver` - native binary that interfaces between libFuzzer and the JVM fuzz target
- `jazzer_agent_deploy.jar` - Java agent that performs bytecode instrumentation and tracks coverage
- `jazzer_api_deploy.jar` - contains convenience methods for creating fuzz targets and defining custom hooks
- `jazzer` - convenience shell script that runs the Jazzer driver with the local JRE shared libraries added to `LD_LIBRARY_PATH`

The additional release artifact `examples_deploy.jar` contains most of the examples and can be used to run them without having to build them (see Examples below).

After unpacking the archive, run Jazzer via

```bash
./jazzer <arguments>
```

If this leads to an error message saying that `libjvm.so` has not been found, the path to the local JRE needs to be
specified in the `JAVA_HOME` environment variable.

## Examples

Multiple examples for instructive and real-world Jazzer fuzz targets can be found in the `examples/` directory.
A toy example can be run as follows:

```bash
# Using Bazelisk:
./bazelisk-linux-amd64 run //examples:ExampleFuzzer
# Using the binary release and examples_deploy.jar:
./jazzer --cp=examples_deploy.jar
```

This should produce output similar to the following:

```
INFO: Loaded 1 hooks from com.example.ExampleFuzzerHooks
INFO: Instrumented com.example.ExampleFuzzer (took 81 ms, size +83%)
INFO: libFuzzer ignores flags that start with '--'
INFO: Seed: 2735196724
INFO: Loaded 1 modules (65536 inline 8-bit counters): 65536 [0xe387b0, 0xe487b0),
INFO: Loaded 1 PC tables (65536 PCs): 65536 [0x7f9353eff010,0x7f9353fff010),
INFO: -max_len is not provided; libFuzzer will not generate inputs larger than 4096 bytes
INFO: A corpus is not provided, starting from an empty corpus
#2 INITED cov: 2 ft: 2 corp: 1/1b exec/s: 0 rss: 94Mb
#1562 NEW cov: 4 ft: 4 corp: 2/14b lim: 17 exec/s: 0 rss: 98Mb L: 13/13 MS: 5 ShuffleBytes-CrossOver-InsertRepeatedBytes-ShuffleBytes-CMP- DE: "magicstring4"-
#1759 REDUCE cov: 4 ft: 4 corp: 2/13b lim: 17 exec/s: 0 rss: 99Mb L: 12/12 MS: 2 ChangeBit-EraseBytes-
#4048 NEW cov: 6 ft: 6 corp: 3/51b lim: 38 exec/s: 0 rss: 113Mb L: 38/38 MS: 4 ChangeBit-ChangeByte-CopyPart-CrossOver-
#4055 REDUCE cov: 6 ft: 6 corp: 3/49b lim: 38 exec/s: 0 rss: 113Mb L: 36/36 MS: 2 ShuffleBytes-EraseBytes-
#4266 REDUCE cov: 6 ft: 6 corp: 3/48b lim: 38 exec/s: 0 rss: 113Mb L: 35/35 MS: 1 EraseBytes-
#4498 REDUCE cov: 6 ft: 6 corp: 3/47b lim: 38 exec/s: 0 rss: 114Mb L: 34/34 MS: 2 EraseBytes-CopyPart-
#4764 REDUCE cov: 6 ft: 6 corp: 3/46b lim: 38 exec/s: 0 rss: 115Mb L: 33/33 MS: 1 EraseBytes-
#5481 REDUCE cov: 6 ft: 6 corp: 3/44b lim: 43 exec/s: 0 rss: 116Mb L: 31/31 MS: 2 InsertByte-EraseBytes-
#131072 pulse cov: 6 ft: 6 corp: 3/44b lim: 1290 exec/s: 65536 rss: 358Mb

== Java Exception: java.lang.IllegalStateException: mustNeverBeCalled has been called
    at com.example.ExampleFuzzer.mustNeverBeCalled(ExampleFuzzer.java:38)
    at com.example.ExampleFuzzer.fuzzerTestOneInput(ExampleFuzzer.java:32)
DEDUP_TOKEN: eb6ee7d9b256590d
== libFuzzer crashing input ==
MS: 1 CMP- DE: "\x00C"-; base unit: 04e0ccacb50424e06e45f6184ad45895b6b8df8f
0x6d,0x61,0x67,0x69,0x63,0x73,0x74,0x72,0x69,0x6e,0x67,0x34,0x74,0x72,0x69,0x6e,0x67,0x34,0x74,0x69,0x67,0x34,0x7b,0x0,0x0,0x43,0x34,0xa,0x0,0x0,0x0,
magicstring4tring4tig4{\x00\x00C4\x0a\x00\x00\x00
artifact_prefix='./'; Test unit written to crash-efea1e8fc83a15217d512e20d964040a68a968c3
Base64: bWFnaWNzdHJpbmc0dHJpbmc0dGlnNHsAAEM0CgAAAA==
reproducer_path='.'; Java reproducer written to Crash_efea1e8fc83a15217d512e20d964040a68a968c3.java
```

Here you can see the usual libFuzzer output in case of a crash, augmented with JVM-specific information.
Instead of a native stack trace, the details of the uncaught Java exception that caused the crash are printed, followed by the fuzzer input that caused the exception to be thrown (if it is not too long).
More information on what hooks and Java reproducers are can be found below.

See `examples/BUILD.bazel` for the list of all possible example targets.

## Usage

### Creating a fuzz target

Jazzer requires a JVM class containing the entry point for the fuzzer. This is commonly referred to as a "fuzz target" and
may be as simple as the following Java example:

```java
package com.example;

public class MyFirstFuzzTarget {
    public static void fuzzerTestOneInput(byte[] input) {
        ...
        // Call the function under test with arguments derived from input and
        // throw an exception if something unwanted happens.
        ...
    }
}
```

A Java fuzz target class needs to define exactly one of the following functions:

* `public static void fuzzerTestOneInput(byte[] input)`: Ideal for fuzz targets that naturally work on raw byte input (e.g.
  image parsers).
* `public static void fuzzerTestOneInput(com.code_intelligence.jazzer.api.FuzzedDataProvider data)`: A variety of types of "fuzzed
  data" is made available via the `FuzzedDataProvider` interface (see below for more information on this interface).

The fuzzer will repeatedly call this function with generated inputs. All unhandled exceptions are caught and
reported as errors.

The optional functions `public static void fuzzerInitialize()` or `public static void fuzzerInitialize(String[] args)`
can be defined if initial setup is required. These functions will be called once before
the first call to `fuzzerTestOneInput`.

The optional function `public static void fuzzerTearDown()` will be run just before the JVM is shut down.

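As a minimal sketch of how the three entry points fit together, the following target checks a round-trip property on raw bytes. The class name `LifecycleFuzzTarget` and the hex helpers are hypothetical and only serve as a stand-in for a real function under test.

```java
import java.util.Arrays;

// Hypothetical fuzz target: the class name and the hex helpers are
// illustrative stand-ins for real code under test, not part of Jazzer.
public class LifecycleFuzzTarget {
    private static char[] hexDigits;
    private static long executions;

    // Optional: called once before the first call to fuzzerTestOneInput.
    public static void fuzzerInitialize() {
        hexDigits = "0123456789abcdef".toCharArray();
    }

    // Called repeatedly with generated inputs; an uncaught exception is a finding.
    public static void fuzzerTestOneInput(byte[] input) {
        executions++;
        byte[] roundTripped = decodeHex(encodeHex(input));
        if (!Arrays.equals(input, roundTripped)) {
            throw new IllegalStateException("hex round trip changed the input");
        }
    }

    // Optional: run just before the JVM is shut down.
    public static void fuzzerTearDown() {
        System.err.println("Executed " + executions + " inputs");
    }

    static String encodeHex(byte[] bytes) {
        StringBuilder out = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            out.append(hexDigits[(b >> 4) & 0xf]).append(hexDigits[b & 0xf]);
        }
        return out.toString();
    }

    static byte[] decodeHex(String hex) {
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) Integer.parseInt(hex.substring(2 * i, 2 * i + 2), 16);
        }
        return out;
    }
}
```

Throwing an exception when the property is violated is what turns a plain method call into a check the fuzzer can report.
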
#### Kotlin

An example of a Kotlin fuzz target can be found in
[KlaxonFuzzer.kt](https://github.com/CodeIntelligenceTesting/jazzer/tree/main/examples/src/main/java/com/example/KlaxonFuzzer.kt).

### Running the fuzzer

The fuzz target needs to be compiled and packaged into a `.jar` archive. Assuming that this archive is called
`fuzz_target.jar` and depends on libraries available as `lib1.jar` and `lib2.jar`, fuzzing is started by
invoking Jazzer with the following arguments:

```bash
--cp=fuzz_target.jar:lib1.jar:lib2.jar --target_class=com.example.MyFirstFuzzTarget <optional_corpus_dir>
```

The fuzz target class can optionally be specified by adding it as the value of the `Jazzer-Fuzz-Target-Class` attribute
in the JAR's manifest. If there is only a single such attribute among all manifests of JARs on the classpath, Jazzer will
use its value as the fuzz target class.

To build such a `.jar` with Bazel, define a `java_binary` target with `create_executable = False` and
`deploy_manifest_lines = ["Jazzer-Fuzz-Target-Class: com.example.MyFirstFuzzTarget"]` and build it with the suffix
`_deploy.jar` appended to the target name.

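To check that a JAR carries the expected manifest attribute, the standard `java.util.jar` API can be used. The sketch below builds a manifest in memory and reads the attribute back. The class `ManifestTargetClassSketch` is hypothetical and shows only the lookup, not how Jazzer itself resolves the target class.

```java
import java.util.jar.Attributes;
import java.util.jar.Manifest;

// Hypothetical helper; illustrates the manifest lookup with standard APIs only.
public class ManifestTargetClassSketch {
    static final Attributes.Name FUZZ_TARGET_CLASS =
            new Attributes.Name("Jazzer-Fuzz-Target-Class");

    // Returns the attribute value, or null if the manifest does not declare one.
    static String fuzzTargetClass(Manifest manifest) {
        return manifest.getMainAttributes().getValue(FUZZ_TARGET_CLASS);
    }

    // Builds an in-memory manifest like the one a packaged fuzz target would carry.
    static Manifest sampleManifest() {
        Manifest manifest = new Manifest();
        manifest.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
        manifest.getMainAttributes().put(FUZZ_TARGET_CLASS, "com.example.MyFirstFuzzTarget");
        return manifest;
    }

    public static void main(String[] args) {
        System.out.println(fuzzTargetClass(sampleManifest()));
    }
}
```

For a JAR on disk, the same lookup applies to the manifest returned by `new java.util.jar.JarFile(path).getManifest()`.
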
### Fuzzed Data Provider

For most non-trivial fuzz targets it is necessary to further process the byte array passed from the fuzzer, for example
to extract multiple values or convert the input into a valid `java.lang.String`. We provide functionality similar to
[atheris'](https://github.com/google/atheris) `FuzzedDataProvider` and libFuzzer's `FuzzedDataProvider.h` to simplify
the task of writing JVM fuzz targets.

If the function `public static void fuzzerTestOneInput(FuzzedDataProvider data)` is defined in the fuzz target, it will
be passed an object implementing `com.code_intelligence.jazzer.api.FuzzedDataProvider` that allows _consuming_ the raw fuzzer
input as values of common types. This can look as follows:

```java
package com.example;

import com.code_intelligence.jazzer.api.FuzzedDataProvider;

public class MySecondFuzzTarget {
    public static void callApi(int val, String text) {
        ...
    }

    public static void fuzzerTestOneInput(FuzzedDataProvider data) {
        callApi(data.consumeInt(), data.consumeRemainingAsString());
    }
}
```

The `FuzzedDataProvider` interface definition is contained in `jazzer_api_deploy.jar` in the binary release and can be
built by the Bazel target `//agent:jazzer_api_deploy.jar`. It is also available from
[Maven Central](https://search.maven.org/search?q=g:com.code-intelligence%20a:jazzer-api).
For additional information, see the
[javadocs](https://codeintelligencetesting.github.io/jazzer-api/com/code_intelligence/jazzer/api/FuzzedDataProvider.html).

It is highly recommended to use `FuzzedDataProvider` for generating `java.lang.String` objects inside the fuzz target
instead of converting the raw byte array directly via a `String` constructor, as the `FuzzedDataProvider` implementation is
engineered to minimize copying and generate both valid and invalid ASCII-only and Unicode strings.

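One way to see why a direct conversion can be limiting is to look at how the standard `String` constructor treats malformed input. The sketch below relies only on documented JDK behavior and says nothing about how `FuzzedDataProvider` is implemented. The class name `RawStringConversionSketch` is hypothetical.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; shows JDK decoding behavior, not Jazzer internals.
public class RawStringConversionSketch {
    static String decode(byte[] input) {
        // The constructor replaces malformed byte sequences with U+FFFD.
        return new String(input, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // 0xff and 0xfe never occur in valid UTF-8, so two distinct fuzzer
        // inputs collapse into the same one-character string.
        String first = decode(new byte[] {(byte) 0xff});
        String second = decode(new byte[] {(byte) 0xfe});
        System.out.println(first.equals(second));
        System.out.println(first.equals("\uFFFD"));
    }
}
```

Because malformed bytes are always replaced, a target that decodes its input this way never sees strings the decoder would not produce, such as strings containing unpaired surrogates.
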
### Autofuzz mode

The Autofuzz mode enables fuzzing arbitrary methods without having to manually create fuzz targets.
Instead, Jazzer will attempt to generate suitable and varied inputs to a specified method using only public API functions available on the classpath.

To use Autofuzz, specify the `--autofuzz` flag and provide a fully qualified method reference, e.g.:
```
--autofuzz=org.apache.commons.imaging.Imaging::getBufferedImage
```
If there are multiple overloads and you want Jazzer to only fuzz one, you can optionally specify the signature of the method to fuzz:
```
--autofuzz=org.apache.commons.imaging.Imaging::getBufferedImage(java.io.InputStream,java.util.Map)
```
The signature uses the same format as the part after the `#` in the Javadoc link for the particular method.

Under the hood, Jazzer tries various ways of creating objects from the fuzzer input. For example, if a parameter is an
interface or an abstract class, it will look for all concrete implementing classes on the classpath.
Jazzer can also create objects from classes that follow the [builder design pattern](https://www.baeldung.com/creational-design-patterns#builder)
or have a default constructor and use setters to set the fields.

Creating objects from fuzzer input can lead to many reported exceptions.
Jazzer addresses this issue by ignoring exceptions that the target method declares to throw.
In addition to that, you can provide a list of exceptions to be ignored during fuzzing via the `--autofuzz_ignore` flag in the form of a comma-separated list.
You can specify concrete exceptions (e.g., `java.lang.NullPointerException`), in which case subclasses of these exception classes are also ignored, or glob patterns to ignore all exceptions in a specific package (e.g. `java.lang.*` or `com.company.**`).

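The idea of ignoring exceptions that a method declares to throw can be sketched with the standard reflection API. This is an illustration of the concept under the assumption that a declared type also covers its subclasses. It is not taken from Jazzer's sources, and `DeclaredExceptionSketch` and its `parse` method are hypothetical.

```java
import java.io.IOException;
import java.lang.reflect.Method;

// Hypothetical sketch of the concept; not Jazzer's actual implementation.
public class DeclaredExceptionSketch {
    // Hypothetical method under test that declares a checked exception.
    public static void parse(String text) throws IOException {
        if (text.isEmpty()) {
            throw new IOException("empty input");
        }
    }

    // True if the thrown exception is an instance of a declared exception type.
    static boolean isDeclared(Method method, Throwable thrown) {
        for (Class<?> declared : method.getExceptionTypes()) {
            if (declared.isInstance(thrown)) {
                return true;
            }
        }
        return false;
    }

    // Looks up the sample method, converting the checked lookup failure.
    static Method parseMethod() {
        try {
            return DeclaredExceptionSketch.class.getMethod("parse", String.class);
        } catch (NoSuchMethodException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

With this check, a `FileNotFoundException` thrown by `parse` would count as declared because it extends `IOException`, whereas an `IllegalStateException` would still be reported.
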
When fuzzing with `--autofuzz`, Jazzer automatically enables the `--keep_going` mode to keep fuzzing indefinitely after the first finding.
Set `--keep_going=N` explicitly to stop after the `N`-th finding.

#### Docker
To facilitate using the Autofuzz mode, there is a Docker image that you can use to fuzz libraries just by providing their Maven coordinates.
The dependencies will then be downloaded and autofuzzed:

```sh
docker run cifuzz/jazzer-autofuzz <Maven coordinates> --autofuzz=<method reference> <further arguments>
```

As an example, you can autofuzz the `json-sanitizer` library as follows:
```sh
docker run -it cifuzz/jazzer-autofuzz \
    com.mikesamuel:json-sanitizer:1.2.0 \
    com.google.json.JsonSanitizer::sanitize \
    --autofuzz_ignore=java.lang.ArrayIndexOutOfBoundsException \
    --keep_going=1
```

### Reproducing a bug

When Jazzer manages to find an input that causes an uncaught exception or a failed assertion, it prints a Java
stack trace and creates two files that aid in reproducing the crash without Jazzer:

* `crash-<sha1_of_input>` contains the raw bytes passed to the fuzz target (just as with libFuzzer C/C++ fuzz targets).
  The crash can be reproduced with Jazzer by passing the path to the crash file as the only positional argument.
* `Crash-<sha1_of_input>.java` contains a class with a `main` function that invokes the fuzz target with the
  crashing input. This is especially useful if using `FuzzedDataProvider` as the raw bytes of the input do not
  directly correspond to the values consumed by the fuzz target. The `.java` file can be compiled with just
  the fuzz target and its dependencies in the classpath (plus `jazzer_api_deploy.jar` if using `FuzzedDataProvider`).

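The general shape of a standalone reproducer for a `byte[]` fuzz target can be sketched as follows. This is not the file Jazzer generates; it only illustrates the idea of embedding the crashing input and calling the target from `main`. The nested `SketchTarget` class is a hypothetical stand-in for a real fuzz target, and the Base64 string is the one printed in the sample crash report in the Examples section.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical reproducer sketch; the file generated by Jazzer may look different.
public class ReproducerSketch {
    // Hypothetical stand-in for a real fuzz target class.
    static final class SketchTarget {
        public static void fuzzerTestOneInput(byte[] input) {
            String text = new String(input, StandardCharsets.ISO_8859_1);
            if (text.startsWith("magicstring4")) {
                throw new IllegalStateException("bug reached");
            }
        }
    }

    // The "Base64:" line from the sample crash report.
    static final String CRASHING_INPUT_BASE64 =
            "bWFnaWNzdHJpbmc0dHJpbmc0dGlnNHsAAEM0CgAAAA==";

    static byte[] crashingInput() {
        return Base64.getDecoder().decode(CRASHING_INPUT_BASE64);
    }

    // Invokes the target once with the embedded input; the uncaught
    // exception reproduces the finding without the fuzzer.
    public static void main(String[] args) {
        SketchTarget.fuzzerTestOneInput(crashingInput());
    }
}
```

Embedding the input in the source keeps the reproducer independent of the crash file.
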
### Minimizing a crashing input

Every crash stack trace is accompanied by a `DEDUP_TOKEN` that uniquely identifies the relevant parts of the stack
trace. This value is used by libFuzzer while minimizing a crashing input to ensure that the smaller inputs reproduce
the "same" bug. To minimize a crashing input, execute Jazzer with the following arguments in addition to `--cp` and
`--target_class`:

```bash
-minimize_crash=1 <path/to/crashing_input>
```

### Parallel execution

libFuzzer offers the `-fork=N` and `-jobs=N` flags for parallel fuzzing, both of which are also supported by Jazzer.

### Limitations

Jazzer currently maintains coverage information in a global variable that is shared among threads. This means that while
fuzzing multi-threaded fuzz targets is theoretically possible, the reported coverage information may be misleading.

## Findings

Jazzer has so far uncovered the following vulnerabilities and bugs:

| Project | Bug | Status | CVE | Found by |
| ------- | -------- | ------ | --- | -------- |
| [jhy/jsoup](https://github.com/jhy/jsoup) | More than 19 bugs found in HTML and XML parser | [fixed](https://github.com/jhy/jsoup/security/advisories/GHSA-m72m-mhq2-9p6c) | [CVE-2021-37714](https://nvd.nist.gov/vuln/detail/CVE-2021-37714) | [Code Intelligence](https://code-intelligence.com) |
| [Apache/commons-compress](https://commons.apache.org/proper/commons-compress/) | Infinite loop when loading a crafted 7z | fixed | [CVE-2021-35515](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-35515) | [Code Intelligence](https://code-intelligence.com) |
| [Apache/commons-compress](https://commons.apache.org/proper/commons-compress/) | `OutOfMemoryError` when loading a crafted 7z | fixed | [CVE-2021-35516](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-35516) | [Code Intelligence](https://code-intelligence.com) |
| [Apache/commons-compress](https://commons.apache.org/proper/commons-compress/) | Infinite loop when loading a crafted TAR | fixed | [CVE-2021-35517](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-35517) | [Code Intelligence](https://code-intelligence.com) |
| [Apache/commons-compress](https://commons.apache.org/proper/commons-compress/) | `OutOfMemoryError` when loading a crafted ZIP | fixed | [CVE-2021-36090](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-36090) | [Code Intelligence](https://code-intelligence.com) |
| [Apache/PDFBox](https://pdfbox.apache.org/) | Infinite loop when loading a crafted PDF | fixed | [CVE-2021-27807](https://cve.mitre.org/cgi-bin/cvename.cgi?name=2021-27807) | [Code Intelligence](https://code-intelligence.com) |
| [Apache/PDFBox](https://pdfbox.apache.org/) | `OutOfMemoryError` when loading a crafted PDF | fixed | [CVE-2021-27906](https://cve.mitre.org/cgi-bin/cvename.cgi?name=2021-27906) | [Code Intelligence](https://code-intelligence.com) |
| [netplex/json-smart-v1](https://github.com/netplex/json-smart-v1) <br/> [netplex/json-smart-v2](https://github.com/netplex/json-smart-v2) | `JSONParser#parse` throws an undeclared exception | [fixed](https://github.com/netplex/json-smart-v2/issues/60) | [CVE-2021-27568](https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2021-27568) | [@GanbaruTobi](https://github.com/GanbaruTobi) |
| [OWASP/json-sanitizer](https://github.com/OWASP/json-sanitizer) | Output can contain `</script>` and `]]>`, which allows XSS | [fixed](https://groups.google.com/g/json-sanitizer-support/c/dAW1AeNMoA0) | [CVE-2021-23899](https://cve.mitre.org/cgi-bin/cvename.cgi?name=2021-23899) | [Code Intelligence](https://code-intelligence.com) |
| [OWASP/json-sanitizer](https://github.com/OWASP/json-sanitizer) | Output can be invalid JSON and undeclared exceptions can be thrown | [fixed](https://groups.google.com/g/json-sanitizer-support/c/dAW1AeNMoA0) | [CVE-2021-23900](https://cve.mitre.org/cgi-bin/cvename.cgi?name=2021-23900) | [Code Intelligence](https://code-intelligence.com) |
| [alibaba/fastjson](https://github.com/alibaba/fastjson) | `JSON#parse` throws undeclared exceptions | [fixed](https://github.com/alibaba/fastjson/issues/3631) | | [Code Intelligence](https://code-intelligence.com) |
| [Apache/commons-compress](https://commons.apache.org/proper/commons-compress/) | Infinite loop and `OutOfMemoryError` in `TarFile` | [fixed](https://issues.apache.org/jira/browse/COMPRESS-569) | | [Code Intelligence](https://code-intelligence.com) |
| [Apache/commons-compress](https://commons.apache.org/proper/commons-compress/) | `NullPointerException` in `ZipFile` | [fixed](https://issues.apache.org/jira/browse/COMPRESS-568) | | [Code Intelligence](https://code-intelligence.com) |
| [Apache/commons-imaging](https://commons.apache.org/proper/commons-imaging/) | Parsers for multiple image formats throw undeclared exceptions | [reported](https://issues.apache.org/jira/browse/IMAGING-279?jql=project%20%3D%20%22Commons%20Imaging%22%20AND%20reporter%20%3D%20Meumertzheim%20) | | [Code Intelligence](https://code-intelligence.com) |
| [Apache/PDFBox](https://pdfbox.apache.org/) | Various undeclared exceptions | [fixed](https://issues.apache.org/jira/browse/PDFBOX-5108?jql=project%20%3D%20PDFBOX%20AND%20reporter%20in%20(Meumertzheim)) | | [Code Intelligence](https://code-intelligence.com) |
| [cbeust/klaxon](https://github.com/cbeust/klaxon) | Default parser throws runtime exceptions | [fixed](https://github.com/cbeust/klaxon/pull/330) | | [Code Intelligence](https://code-intelligence.com) |
| [FasterXML/jackson-dataformats-binary](https://github.com/FasterXML/jackson-dataformats-binary) | `CBORParser` throws an undeclared exception due to missing bounds checks when parsing Unicode | [fixed](https://github.com/FasterXML/jackson-dataformats-binary/issues/236) | | [Code Intelligence](https://code-intelligence.com) |
| [FasterXML/jackson-dataformats-binary](https://github.com/FasterXML/jackson-dataformats-binary) | `CBORParser` throws an undeclared exception on dangling arrays | [fixed](https://github.com/FasterXML/jackson-dataformats-binary/issues/240) | | [Code Intelligence](https://code-intelligence.com) |
| [ngageoint/tiff-java](https://github.com/ngageoint/tiff-java) | `readTiff` Index Out Of Bounds | [fixed](https://github.com/ngageoint/tiff-java/issues/38) | | [@raminfp](https://github.com/raminfp) |
| [google/re2j](https://github.com/google/re2j) | `NullPointerException` in `Pattern.compile` | [reported](https://github.com/google/re2j/issues/148) | | [@schirrmacher](https://github.com/schirrmacher) |
| [google/gson](https://github.com/google/gson) | `ArrayIndexOutOfBounds` in `ParseString` | [fixed](https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=40838) | | [@DavidKorczynski](https://twitter.com/Davkorcz) |

As Jazzer is used to fuzz JVM projects in OSS-Fuzz, an additional list of bugs can be found [on the OSS-Fuzz issue tracker](https://bugs.chromium.org/p/oss-fuzz/issues/list?q=proj%3A%22json-sanitizer%22%20OR%20proj%3A%22fastjson2%22%20OR%20proj%3A%22jackson-core%22%20OR%20proj%3A%22jackson-dataformats-binary%22%20OR%20proj%3A%22jackson-dataformats-xml%22%20OR%20proj%3A%22apache-commons%22%20OR%20proj%3A%22jsoup%22&can=1).

If you find bugs with Jazzer, we would like to hear from you!
Feel free to [open an issue](https://github.com/CodeIntelligenceTesting/jazzer/issues/new) or submit a pull request.

## Advanced Options

Various command line options are available to control the instrumentation and fuzzer execution. Since Jazzer is a
libFuzzer-compiled binary, all positional and single dash command-line options are parsed by libFuzzer. Therefore, all
Jazzer options are passed via double dash command-line flags, i.e., as `--option=value` (note the `=` instead of a space).

A full list of command-line flags can be printed with the `--help` flag. For a detailed description of the available
libFuzzer options, refer to [its documentation](https://llvm.org/docs/LibFuzzer.html).

### Passing JVM arguments

Arguments for the JVM started by Jazzer can be supplied via the `--jvm_args` argument.
Multiple arguments are delimited by the classpath separator, which is `;` on Windows and `:` otherwise.
For example, to enable preview features as well as set a maximum heap size, add the following to the Jazzer invocation:

```bash
# Windows
--jvm_args=--enable-preview;-Xmx1000m
# Linux & macOS
--jvm_args=--enable-preview:-Xmx1000m
```

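When the Jazzer command line is assembled from Java code, for example in a test harness, the platform-dependent delimiter is available as `java.io.File.pathSeparator`. The following sketch builds the flag value portably. The class `JvmArgsSketch` and its method are hypothetical helpers, not part of Jazzer.

```java
import java.io.File;

// Hypothetical helper for assembling the --jvm_args flag; not part of Jazzer.
public class JvmArgsSketch {
    // Joins the JVM arguments with the classpath separator of the current
    // platform (";" on Windows, ":" elsewhere).
    static String jvmArgsFlag(String... jvmArgs) {
        return "--jvm_args=" + String.join(File.pathSeparator, jvmArgs);
    }

    public static void main(String[] args) {
        System.out.println(jvmArgsFlag("--enable-preview", "-Xmx1000m"));
    }
}
```

Note that this scheme cannot express a JVM argument that itself contains the separator character.
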
### Coverage Instrumentation

The Jazzer agent inserts coverage markers into the JVM bytecode during class loading. libFuzzer uses this information
to guide its input mutations towards increased coverage.

It is possible to restrict instrumentation to only a subset of classes with the `--instrumentation_includes` flag. This
is especially useful if coverage inside specific packages is of higher interest, e.g., the user library under test rather than an
external parsing library in which the fuzzer is likely to get lost. Similarly, there is `--instrumentation_excludes` to
exclude specific classes from instrumentation. Both flags take a list of glob patterns for Java class names separated
by colons:

```bash
--instrumentation_includes=com.my_com.**:com.other_com.** --instrumentation_excludes=com.my_com.crypto.**
```

By default, JVM-internal classes and Java as well as Kotlin standard library classes are not instrumented, so these do not
need to be excluded manually.

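To reason about which classes a pair of include and exclude lists selects, the patterns can be modeled with regular expressions. The sketch below assumes a common glob convention, where `**` matches any characters including `.` and `*` stays within one package segment, and assumes that excludes take precedence over includes. Both assumptions are illustrative and not taken from Jazzer's sources, and the class `ClassNameGlobSketch` is hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical model of colon-separated class name globs. The matching
// rules are assumptions for illustration, not Jazzer's implementation.
public class ClassNameGlobSketch {
    // Assumed semantics: "**" matches across packages, "*" within one segment.
    static Pattern toRegex(String glob) {
        StringBuilder regex = new StringBuilder();
        int i = 0;
        while (i < glob.length()) {
            if (glob.startsWith("**", i)) {
                regex.append(".*");
                i += 2;
            } else if (glob.charAt(i) == '*') {
                regex.append("[^.]*");
                i += 1;
            } else {
                regex.append(Pattern.quote(String.valueOf(glob.charAt(i))));
                i += 1;
            }
        }
        return Pattern.compile(regex.toString());
    }

    static boolean matchesAny(String className, String colonSeparatedGlobs) {
        for (String glob : colonSeparatedGlobs.split(":")) {
            if (!glob.isEmpty() && toRegex(glob).matcher(className).matches()) {
                return true;
            }
        }
        return false;
    }

    // Assumed precedence: a class is selected if included and not excluded.
    static boolean isSelected(String className, String includes, String excludes) {
        return matchesAny(className, includes) && !matchesAny(className, excludes);
    }
}
```

Under these assumptions, the flags from the example above would select `com.my_com.parser.Json` but not `com.my_com.crypto.Aes`.
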
### Trace Instrumentation

The agent adds additional hooks for tracing compares, integer divisions, switch statements and array indices.
These hooks correspond to [clang's data flow hooks](https://clang.llvm.org/docs/SanitizerCoverage.html#tracing-data-flow).
The particular instrumentation types to apply can be specified using the `--trace` flag, which accepts the following values:

* `cov`: AFL-style edge coverage
* `cmp`: compares (int, long, String) and switch cases
* `div`: divisors in integer divisions
* `gep`: constant array indices
* `indir`: calls through `Method#invoke`
* `all`: shorthand to apply all available instrumentations (except `gep`)

Multiple instrumentation types can be combined with a colon.
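
As a small sketch of how such a value can be assembled, the following shell snippet joins three of the types listed above with colons. The variable names are illustrative and the selection of types is arbitrary:

```bash
# Instrumentation types to enable, chosen from the list above.
trace_types="cmp div gep"

# Join the types with ':' to form the value of the --trace flag.
trace_flag="--trace=$(printf '%s' "$trace_types" | tr ' ' ':')"

# Print the flag so it can be appended to the Jazzer invocation.
printf '%s\n' "$trace_flag"
```

Since `all` does not include `gep`, tracing of constant array indices has to be requested explicitly, as done here.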

### Value Profile

The run-time flag `-use_value_profile=1` enables [libFuzzer's value profiling mode](https://llvm.org/docs/LibFuzzer.html#value-profile).
When running with this flag, the feedback about compares and constants received from Jazzer's trace instrumentation is
associated with the particular bytecode location and used as additional coverage feedback.
See [ExampleValueProfileFuzzer.java](https://github.com/CodeIntelligenceTesting/jazzer/tree/main/examples/src/main/java/com/example/ExampleValueProfileFuzzer.java)
for a fuzz target that would be very hard to fuzz without value profile.

As passing the bytecode location back to libFuzzer requires inline assembly and may thus not be fully portable, it can be disabled
via the flag `--nofake_pcs`.

### Custom Hooks

In order to obtain information about data passed into functions such as `String.equals` or `String.startsWith`, Jazzer
hooks invocations of these methods. This functionality is also available to fuzz targets, where it can be used to implement
custom sanitizers or stub out methods that block the fuzzer from progressing (e.g. checksum verifications or random number generation).
See [ExampleFuzzerHooks.java](https://github.com/CodeIntelligenceTesting/jazzer/tree/main/examples/src/main/java/com/example/ExampleFuzzerHooks.java)
for an example of such a hook. An example of a sanitizer can be found in
[ExamplePathTraversalFuzzerHooks.java](https://github.com/CodeIntelligenceTesting/jazzer/tree/main/examples/src/main/java/com/example/ExamplePathTraversalFuzzerHooks.java).

Method hooks can be declared using the `@MethodHook` annotation defined in the `com.code_intelligence.jazzer.api` package,
which is contained in `jazzer_api_deploy.jar` (binary release) or built by the target `//agent:jazzer_api_deploy.jar` (Bazel).
It is also available from
[Maven Central](https://search.maven.org/search?q=g:com.code-intelligence%20a:jazzer-api).
See the [javadocs of the `@MethodHook` API](https://codeintelligencetesting.github.io/jazzer-api/com/code_intelligence/jazzer/api/MethodHook.html)
for more details.

To use the compiled method hooks, they have to be available on the classpath provided by `--cp`. They can then be loaded by passing the
flag `--custom_hooks`, which takes a colon-separated list of names of classes to load hooks from.
This list of custom hooks can alternatively be specified via the `Jazzer-Hook-Classes` attribute in the fuzz target
JAR's manifest.
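
As a sketch of the flag-based variant on Linux or macOS, the snippet below assembles the two flags for a hypothetical setup in which the fuzz target is packaged in `fuzz_target.jar` and the compiled hooks in `hooks.jar`. Both file names are assumptions, while the hook class names are those of the examples linked above:

```bash
# Hypothetical JAR names; the classpath separator is ':' on Linux and macOS.
classpath="fuzz_target.jar:hooks.jar"

# Fully qualified names of the classes that contain @MethodHook methods.
hook_classes="com.example.ExampleFuzzerHooks:com.example.ExamplePathTraversalFuzzerHooks"

# Both flags are appended to the Jazzer invocation.
hook_flags="--cp=${classpath} --custom_hooks=${hook_classes}"
printf '%s\n' "$hook_flags"
```

The hook classes have to be reachable via `--cp`, which is why `hooks.jar` appears on the classpath here.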

### Suppressing stack traces

With the flag `--keep_going=N`, Jazzer continues fuzzing until `N` unique stack traces have been encountered.

Particular stack traces can also be ignored based on their `DEDUP_TOKEN` by passing a comma-separated list of tokens
via `--ignore=<token_1>,<token_2>`.

## Advanced fuzz targets

### Fuzzing with Native Libraries

Jazzer supports fuzzing of native libraries loaded by the JVM, for example via `System.load()`. For the fuzzer to get
coverage feedback, these libraries have to be compiled with `-fsanitize=fuzzer-no-link`.

Additional sanitizers such as AddressSanitizer or UndefinedBehaviorSanitizer are often desirable to uncover bugs inside
the native libraries. The required compilation flags for native libraries are as follows:
 - *AddressSanitizer*: `-fsanitize=fuzzer-no-link,address`
 - *UndefinedBehaviorSanitizer*: `-fsanitize=fuzzer-no-link,undefined` (add `-fno-sanitize-recover=all` to crash on UBSan reports)

Then, use the appropriate driver `//:jazzer_asan` or `//:jazzer_ubsan`.
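
A minimal sketch of how the compilation flags listed above might be passed to the compiler, assuming clang and a hypothetical source file `native_lib.c` that is built into `libnative_lib.so`. The script only prints the command so that it can be inspected before being run:

```bash
# Select the sanitizer: "address" or "undefined".
sanitizer="address"

# fuzzer-no-link adds the fuzzing instrumentation without linking the libFuzzer runtime.
cflags="-fsanitize=fuzzer-no-link,${sanitizer}"
if [ "$sanitizer" = "undefined" ]; then
  # Turn UBSan reports into crashes so that the fuzzer notices them.
  cflags="${cflags} -fno-sanitize-recover=all"
fi

# -shared and -fPIC are the usual flags for building a shared library.
# File names are hypothetical.
compile_cmd="clang ${cflags} -shared -fPIC native_lib.c -o libnative_lib.so"
printf '%s\n' "$compile_cmd"
```

A real JNI library typically needs further flags that are omitted here, such as the JDK include paths.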

**Note:** Sanitizers other than AddressSanitizer and UndefinedBehaviorSanitizer are not yet supported.
Furthermore, due to the nature of the JVM's GC, LeakSanitizer reports too many false positives to be useful and is thus disabled.

The fuzz targets `ExampleFuzzerWithNativeASan` and `ExampleFuzzerWithNativeUBSan` in the `examples/` directory contain
minimal working examples for fuzzing with native libraries. Also see `TurboJpegFuzzer` for a real-world example.

### Fuzzing with Custom Mutators

The libFuzzer API offers two functions to customize the mutation strategy, which is especially useful when fuzzing functions
that require structured input. Jazzer defines neither `LLVMFuzzerCustomMutator` nor `LLVMFuzzerCustomCrossOver` and
leaves the mutation strategy entirely to libFuzzer. However, custom mutators can easily be integrated by
compiling a mutator library that defines `LLVMFuzzerCustomMutator` (and optionally `LLVMFuzzerCustomCrossOver`) and
pre-loading the mutator library:

```bash
# Using Bazel:
LD_PRELOAD=libcustom_mutator.so ./bazelisk-linux-amd64 run //:jazzer -- <arguments>
# Using the binary release:
LD_PRELOAD=libcustom_mutator.so ./jazzer <arguments>
```

## Credit

The following developers have contributed to Jazzer:

[Sergej Dechand](https://github.com/serj),
[Christian Hartlage](https://github.com/dende),
[Fabian Meumertzheim](https://github.com/fmeum),
[Sebastian Pöplau](https://github.com/sebastianpoeplau),
[Mohammed Qasem](https://github.com/mohqas),
[Simon Resch](https://github.com/simonresch),
[Henrik Schnor](https://github.com/henrikschnor),
[Khaled Yakdan](https://github.com/kyakdan)

The LLVM-style edge coverage instrumentation for JVM bytecode used by Jazzer relies on [JaCoCo](https://github.com/jacoco/jacoco).
Previously, Jazzer used AFL-style coverage instrumentation as pioneered by [kelinci](https://github.com/isstac/kelinci).

<p align="center">
<a href="https://www.code-intelligence.com"><img src="https://www.code-intelligence.com/hubfs/Logos/CI%20Logos/CI_Header_GitHub_quer.jpeg" height=50px alt="Code Intelligence logo"></a>
</p>