Reimplement fuzzing instrumentation using Bazel transitions. (#86)

* Reimplement fuzzing instrumentation using Bazel transitions.

This approach eliminates the need for inlining the instrumentation options in the bazelrc file and simplifies the adoption of the rules.

* Updated the documentation, too.

* Fix tagging issues with the new rule separation.

* Updated again the documentation.

* Fixed CI tests.

* Revamped the presubmit tests to include richer smoke testing behavior.

* Installing Honggfuzz deps in the smoke test workflow.

* Exclude some MSAN smoke tests.

* Added code documentation for the cc_engine_sanitizer values.

* Address reviewer comments.

* Make buildifier happy.

* Rename msan-repro sanitizer option to msan-origin-tracking.
10 files changed
tree: 97de001970e084ee7406296bdb2dce81cb727a9e
  1. .github/
  2. docs/
  3. examples/
  4. fuzzing/
  5. .bazelrc
  6. .gitignore
  7. BUILD
  8. CODEOWNERS
  9. honggfuzz.BUILD
  10. LICENSE
  11. README.md
  12. update_docs.sh
  13. WORKSPACE
README.md

Bazel Rules for Fuzz Tests

Overview

This repository contains Bazel Starlark extensions for defining fuzz tests in Bazel projects.

Fuzzing is an effective technique for uncovering security and stability bugs in software. Fuzzing works by invoking the code under test (e.g., a library API) with automatically generated data, and observing its execution to discover incorrect behavior, such as memory corruption or failed invariants. Covering fuzzing in detail is outside the scope of this document. Read more here about fuzzing best practices, additional examples, and other resources.

This rule library provides support for writing in-process fuzz tests, which consist of a driver function that receives a generated input string and feeds it to the API under test. To make a complete fuzz test executable, the driver is linked with a fuzzing engine, which implements the test generation logic. The rule library provides out-of-the-box support for the most popular fuzzing engines (e.g., libFuzzer and Honggfuzz), and an extension mechanism to define new fuzzing engines.

The goal of the fuzzing rules is to provide an easy-to-use interface for developers to specify, build, and run fuzz tests, without worrying about the details of each fuzzing engine. A fuzzing rule wraps a raw fuzz test executable and provides additional tools, such as the specification of a corpus and dictionary and a launcher that knows how to invoke the fuzzing engine with the appropriate set of flags.

The rule library currently provides support for C++ fuzz tests. Support for additional languages may be added in the future.

Prerequisites

C++ fuzz tests require a Clang compiler. The libFuzzer engine requires at least Clang 6.0.

In addition, the Honggfuzz engine requires the libunwind-dev and libblocksruntime-dev packages.

Getting started

The fastest way to get a sense of the fuzzing rules is through the examples provided in this repository. Assuming the current directory points to a local clone of this repository, let's explore some of the features provided by the Bazel rules.

Defining fuzz tests

A fuzz test is specified using a cc_fuzz_test rule. In the most basic form, a fuzz test requires a source file that implements the fuzz driver entry point. Let's consider a simple example that fuzzes the RE2 regular expression library:

# BUILD file.

load("@rules_fuzzing//fuzzing:cc_deps.bzl", "cc_fuzz_test")

cc_fuzz_test(
    name = "re2_fuzz_test",
    srcs = ["re2_fuzz_test.cc"],
    deps = [
        "@re2",
    ],
)

The fuzz driver implements the special LLVMFuzzerTestOneInput function that receives the fuzzer-generated string and uses it to drive the API under test:

// Implementation file.

#include <cstdint>
#include <cstddef>
#include <string>

#include "re2/re2.h"

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
    RE2 re(std::string(reinterpret_cast<const char*>(data), size), RE2::Quiet);
    return 0;
}
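
To make the split between driver and engine concrete, here is a minimal sketch of a stand-in "engine" that calls LLVMFuzzerTestOneInput on a fixed list of inputs. This is not how libFuzzer or Honggfuzz work internally (real engines generate and mutate inputs, typically guided by coverage feedback). The toy driver body, the RunOnInputs helper, and the sample inputs are hypothetical and exist only for illustration; the driver avoids RE2 so the example has no external dependencies.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Toy fuzz driver (hypothetical): tracks parenthesis nesting instead of
// calling a real library, so the example needs no external dependencies.
extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
    int depth = 0;
    for (size_t i = 0; i < size; ++i) {
        if (data[i] == '(') ++depth;
        if (data[i] == ')' && depth > 0) --depth;
    }
    return 0;  // Fuzz drivers conventionally return 0.
}

// Stand-in for a fuzzing engine (hypothetical helper): feeds each input to
// the driver exactly once and returns the number of inputs executed.
size_t RunOnInputs(const std::vector<std::string>& inputs) {
    size_t executed = 0;
    for (const std::string& input : inputs) {
        LLVMFuzzerTestOneInput(
            reinterpret_cast<const uint8_t*>(input.data()), input.size());
        ++executed;
    }
    return executed;
}
```

The driver only ever sees a byte buffer and a length, which is what lets the same source file be linked against different engines without changes.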

Building and running

To build a fuzz test, you need to specify which fuzzing engine and what instrumentation to use for tracking errors during the execution of the fuzzer. Let's build the RE2 fuzz test using libFuzzer and the Address Sanitizer (ASAN) instrumentation, which catches memory errors such as buffer overflows and use-after-frees:

$ bazel build -c opt --config=asan-libfuzzer //examples:re2_fuzz_test

You can directly invoke this fuzz test executable if you know libFuzzer's command line interface. But in practice, you don't have to. For each fuzz test <name>, the rules library generates a number of additional targets that provide higher-level functionality to simplify the interaction with the fuzz test.

One such target is <name>_run, which provides a simple engine-agnostic interface for invoking fuzz tests. Let's run our libFuzzer example:

$ bazel run -c opt --config=asan-libfuzzer //examples:re2_fuzz_test_run

The fuzz test will start running locally and write the generated tests to a temporary directory under /tmp/fuzzing. By default, the generated tests persist across runs, which makes it easy to stop and resume runs (possibly under different engines and configurations).

Let's interrupt the fuzz test execution (Ctrl-C), and resume it using the Honggfuzz engine:

$ bazel run -c opt --config=asan-honggfuzz //examples:re2_fuzz_test_run

The <name>_run target accepts a number of engine-agnostic flags. For example, the following command runs the fuzz test with an execution timeout and on a clean slate (removing any previously generated tests). Note the extra -- separator between Bazel's own flags and the launcher flags:

$ bazel run -c opt --config=asan-libfuzzer //examples:re2_fuzz_test_run \
      -- --clean --timeout_secs=30

Additional examples

Check out the examples/ directory, which showcases additional features of the cc_fuzz_test rule.

Using the rules in your project

To use the fuzzing rules in your project, you will need to load and set them up in your workspace. We also recommend creating --config entries in your .bazelrc file for the fuzzing engine and sanitizer combinations you wish to use in your project.

Configuring the WORKSPACE

Add the following to your WORKSPACE file:

load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

http_archive(
    name = "rules_fuzzing",
    sha256 = "a1cde2a5ccc05bdeb75bd0f4c62c6df966134a50278492468bd03ea8ffcaa133",
    strip_prefix = "rules_fuzzing-4de19aafba32cd586abf1bd66ebd3f8d2ea98350",
    urls = ["https://github.com/bazelbuild/rules_fuzzing/archive/4de19aafba32cd586abf1bd66ebd3f8d2ea98350.zip"],
)

load("@rules_fuzzing//fuzzing:repositories.bzl", "rules_fuzzing_dependencies")

rules_fuzzing_dependencies()

load("@rules_fuzzing//fuzzing:dependency_imports.bzl", "fuzzing_dependency_imports")

fuzzing_dependency_imports()

The project is still under active development, so you may need to change the urls, strip_prefix, and sha256 attributes to get the latest features implemented at HEAD.

Configuring the .bazelrc file

Each fuzz test is built with a fuzzing engine and instrumentation specified in three build settings, available as flags on the Bazel command line:

  • --@rules_fuzzing//fuzzing:cc_engine points to the cc_fuzzing_engine target of the fuzzing engine to use.
  • --@rules_fuzzing//fuzzing:cc_engine_instrumentation specifies the compiler instrumentation to use (for example, libFuzzer or Honggfuzz).
  • --@rules_fuzzing//fuzzing:cc_engine_sanitizer specifies the sanitizer configuration used to detect bugs (for example, ASAN or MSAN).

To simplify specifying these settings on the command line, we recommend combining them as --config settings in your project's .bazelrc file. You can copy and paste the .bazelrc file of this repository as a starting point, which defines the following configurations:

| Configuration | Fuzzing engine | Sanitizer |
|---|---|---|
| --config=asan-libfuzzer | libFuzzer | Address Sanitizer (ASAN) |
| --config=msan-libfuzzer | libFuzzer | Memory Sanitizer (MSAN) |
| --config=asan-honggfuzz | Honggfuzz | Address Sanitizer (ASAN) |
You should similarly create additional --config entries for any fuzzing engines defined in your own repository.

Defining fuzzing engines

TODO: Fill in the missing documentation here.

A fuzzing engine launcher script receives configuration through the following environment variables:

| Variable | Description |
|---|---|
| FUZZER_BINARY | The path to the fuzz target executable. |
| FUZZER_TIMEOUT_SECS | If set, a positive integer representing the timeout in seconds for the entire fuzzer run. |
| FUZZER_IS_REGRESSION | Set to 1 if the fuzzer should run in regression mode (just execute the input tests), or 0 if this is a continuous fuzzer run. |
| FUZZER_DICTIONARY_PATH | If set, provides a path to a fuzzing dictionary file. |
| FUZZER_SEED_CORPUS_DIR | If set, provides a directory path to a seed corpus. |
| FUZZER_OUTPUT_ROOT | A writable path that can be used by the fuzzer during its execution (e.g., as a workspace or for generated artifacts). See the variables below for specific categories of output. |
| FUZZER_OUTPUT_CORPUS_DIR | A path under FUZZER_OUTPUT_ROOT where the new generated tests should be stored. |
| FUZZER_ARTIFACTS_DIR | A path under FUZZER_OUTPUT_ROOT where generated crashes and other relevant artifacts should be stored. |
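
As a rough illustration of how a launcher might consume these variables, the sketch below translates them into a libFuzzer-style command line. The flags used (-artifact_prefix, -dict, -max_total_time, -runs) are standard libFuzzer options, but the mapping from each variable to a flag is an assumption made for this example and is not taken from the launchers shipped with the rules. Launchers are more commonly written as shell scripts; C++ is used here only to match the other examples, and GetEnv and BuildLibFuzzerCommand are hypothetical names.

```cpp
#include <cstdlib>
#include <string>
#include <vector>

// Returns the value of an environment variable, or "" if it is unset.
std::string GetEnv(const char* name) {
    const char* value = std::getenv(name);
    return value ? std::string(value) : std::string();
}

// Builds a libFuzzer-style argument list from the FUZZER_* variables.
// The variable-to-flag mapping is an assumption made for illustration.
std::vector<std::string> BuildLibFuzzerCommand() {
    std::vector<std::string> args = {GetEnv("FUZZER_BINARY")};

    const std::string artifacts = GetEnv("FUZZER_ARTIFACTS_DIR");
    if (!artifacts.empty()) {
        args.push_back("-artifact_prefix=" + artifacts + "/");
    }

    const std::string dictionary = GetEnv("FUZZER_DICTIONARY_PATH");
    if (!dictionary.empty()) args.push_back("-dict=" + dictionary);

    const std::string timeout = GetEnv("FUZZER_TIMEOUT_SECS");
    if (!timeout.empty()) args.push_back("-max_total_time=" + timeout);

    // In regression mode, only replay the existing inputs.
    if (GetEnv("FUZZER_IS_REGRESSION") == "1") args.push_back("-runs=0");

    // Corpus directories go last; libFuzzer writes new inputs to the first.
    const std::string output_corpus = GetEnv("FUZZER_OUTPUT_CORPUS_DIR");
    if (!output_corpus.empty()) args.push_back(output_corpus);
    const std::string seed_corpus = GetEnv("FUZZER_SEED_CORPUS_DIR");
    if (!seed_corpus.empty()) args.push_back(seed_corpus);

    return args;
}
```

A launcher for a different engine would read the same variables but emit that engine's own flags, which is what keeps the <name>_run interface engine-agnostic.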

Rule reference