commit | e03b32a776da05bf76ea791600dd4257ecf177e4 | [log] [tgz] |
---|---|---|
author | Stefan Bucur <281483+stefanbucur@users.noreply.github.com> | Fri Dec 04 00:31:20 2020 -0500 |
committer | GitHub <noreply@github.com> | Fri Dec 04 00:31:20 2020 -0500 |
tree | 97de001970e084ee7406296bdb2dce81cb727a9e | |
parent | 8cc9b2932a655231922bf4aa69057ccb055ed5de [diff] |

Reimplement fuzzing instrumentation using Bazel transitions. (#86)

* Reimplement fuzzing instrumentation using Bazel transitions. This approach eliminates the need for inlining the instrumentation options in the bazelrc file and simplifies the adoption of the rules.
* Updated the documentation, too.
* Fix tagging issues with the new rule separation.
* Updated again the documentation.
* Fixed CI tests.
* Revamped the presubmit tests to include richer smoke testing behavior.
* Installing Honggfuzz deps in the smoke test workflow.
* Exclude some MSAN smoke tests.
* Added code documentation for the cc_engine_sanitizer values.
* Address reviewer comments.
* Make buildifier happy.
* Rename msan-repro sanitizer option to msan-origin-tracking.

This repository contains Bazel Starlark extensions for defining fuzz tests in Bazel projects.
Fuzzing is an effective technique for uncovering security and stability bugs in software. Fuzzing works by invoking the code under test (e.g., a library API) with automatically generated data, and observing its execution to discover incorrect behavior, such as memory corruption or failed invariants. Covering fuzzing in detail is outside the scope of this document. Read more here about fuzzing best practices, additional examples, and other resources.
This rule library provides support for writing in-process fuzz tests, which consist of a driver function that receives a generated input string and feeds it to the API under test. To make a complete fuzz test executable, the driver is linked with a fuzzing engine, which implements the test generation logic. The rule library provides out-of-the-box support for the most popular fuzzing engines (e.g., libFuzzer and Honggfuzz), and an extension mechanism to define new fuzzing engines.
The goal of the fuzzing rules is to provide an easy-to-use interface for developers to specify, build, and run fuzz tests, without worrying about the details of each fuzzing engine. A fuzzing rule wraps a raw fuzz test executable and provides additional tools, such as the specification of a corpus and dictionary and a launcher that knows how to invoke the fuzzing engine with the appropriate set of flags.
The rule library currently provides support for C++ fuzz tests. Support for additional languages may be added in the future.
C++ fuzz tests require a Clang compiler. The libFuzzer engine requires at least Clang 6.0.
In addition, the Honggfuzz engine requires the `libunwind-dev` and `libblocksruntime-dev` packages.
The fastest way to get a sense of the fuzzing rules is through the examples provided in this repository. Assuming the current directory points to a local clone of this repository, let's explore some of the features provided by the Bazel rules.
A fuzz test is specified using a `cc_fuzz_test` rule. In the most basic form, a fuzz test requires a source file that implements the fuzz driver entry point. Let's consider a simple example that fuzzes the RE2 regular expression library:

    # BUILD file.

    load("@rules_fuzzing//fuzzing:cc_defs.bzl", "cc_fuzz_test")

    cc_fuzz_test(
        name = "re2_fuzz_test",
        srcs = ["re2_fuzz_test.cc"],
        deps = [
            "@re2",
        ],
    )

The fuzz driver implements the special `LLVMFuzzerTestOneInput` function that receives the fuzzer-generated string and uses it to drive the API under test:

    // Implementation file.

    #include <cstdint>
    #include <cstddef>
    #include <string>

    #include "re2/re2.h"

    extern "C" int LLVMFuzzerTestOneInput(const uint8_t* data, size_t size) {
        RE2 re(std::string(reinterpret_cast<const char*>(data), size), RE2::Quiet);
        return 0;
    }

To build a fuzz test, you need to specify which fuzzing engine and what instrumentation to use for tracking errors during the execution of the fuzzer. Let's build the RE2 fuzz test using libFuzzer and the Address Sanitizer (ASAN) instrumentation, which catches memory errors such as buffer overflows and use-after-frees:

    $ bazel build -c opt --config=asan-libfuzzer //examples:re2_fuzz_test

You can directly invoke this fuzz test executable if you know libFuzzer's command line interface. But in practice, you don't have to. For each fuzz test `<name>`, the rules library generates a number of additional targets that provide higher-level functionality to simplify the interaction with the fuzz test.
One such target is `<name>_run`, which provides a simple engine-agnostic interface for invoking fuzz tests. Let's run our libFuzzer example:

    $ bazel run -c opt --config=asan-libfuzzer //examples:re2_fuzz_test_run

The fuzz test will start running locally and write the generated tests to a temporary path under `/tmp/fuzzing`. By default, the generated tests persist across runs, in order to make it easy to stop and resume runs (possibly under different engines and configurations).
Let's interrupt the fuzz test execution (Ctrl-C), and resume it using the Honggfuzz engine:

    $ bazel run -c opt --config=asan-honggfuzz //examples:re2_fuzz_test_run

The `<name>_run` target accepts a number of engine-agnostic flags. For example, the following command runs the fuzz test with an execution timeout and on a clean slate (removing any previously generated tests). Note the extra `--` separator between Bazel's own flags and the launcher flags:

    $ bazel run -c opt --config=asan-libfuzzer //examples:re2_fuzz_test_run \
        -- --clean --timeout_secs=30

Check out the `examples/` directory, which showcases additional features of the `cc_fuzz_test` rule.
To use the fuzzing rules in your project, you will need to load and set them up in your workspace. We also recommend creating `--config` entries in your `.bazelrc` file for the fuzzing engine + sanitizer configurations you wish to use in your project.
Add the following to your `WORKSPACE` file:

    load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

    http_archive(
        name = "rules_fuzzing",
        sha256 = "a1cde2a5ccc05bdeb75bd0f4c62c6df966134a50278492468bd03ea8ffcaa133",
        strip_prefix = "rules_fuzzing-4de19aafba32cd586abf1bd66ebd3f8d2ea98350",
        urls = ["https://github.com/bazelbuild/rules_fuzzing/archive/4de19aafba32cd586abf1bd66ebd3f8d2ea98350.zip"],
    )

    load("@rules_fuzzing//fuzzing:repositories.bzl", "rules_fuzzing_dependencies")

    rules_fuzzing_dependencies()

    load("@rules_fuzzing//fuzzing:dependency_imports.bzl", "fuzzing_dependency_imports")

    fuzzing_dependency_imports()

The project is still under active development, so you may need to change the `urls` and `sha256` attributes (and the commit hash embedded in `strip_prefix`) to get the latest features implemented at `HEAD`.
Each fuzz test is built with a fuzzing engine and instrumentation specified in three build settings, available as flags on the Bazel command line:

* `--@rules_fuzzing//fuzzing:cc_engine` points to the `cc_fuzzing_engine` target of the fuzzing engine to use.
* `--@rules_fuzzing//fuzzing:cc_engine_instrumentation` specifies the compiler instrumentation to use (for example, libFuzzer or Honggfuzz).
* `--@rules_fuzzing//fuzzing:cc_engine_sanitizer` specifies the sanitizer configuration used to detect bugs (for example, ASAN or MSAN).

To simplify specifying these settings on the command line, we recommend combining them as `--config` settings in your project's `.bazelrc` file. You can copy and paste the `.bazelrc` file of this repository as a starting point, which defines the following configurations:

Configuration | Fuzzing engine | Sanitizer |
---|---|---|
--config=asan-libfuzzer | libFuzzer | Address Sanitizer (ASAN) |
--config=msan-libfuzzer | libFuzzer | Memory Sanitizer (MSAN) |
--config=asan-honggfuzz | Honggfuzz | Address Sanitizer (ASAN) |

You should similarly create additional `--config` entries for any fuzzing engines defined in your own repository.
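
To make the relationship between a `--config` shortcut and the three underlying build settings more concrete, the sketch below spells the flags out explicitly. It is only an illustration: the `fuzz_flags` helper is hypothetical, and the engine label and the `libfuzzer` / `asan` values in the usage comment are assumptions that should be checked against this repository's `.bazelrc`.

```shell
#!/bin/sh
# Print the three rules_fuzzing build-setting flags, one per line.
# Arguments: <engine target> <instrumentation> <sanitizer>
# (fuzz_flags is a hypothetical helper, not part of the rule library.)
fuzz_flags() {
    printf '%s\n' \
        "--@rules_fuzzing//fuzzing:cc_engine=$1" \
        "--@rules_fuzzing//fuzzing:cc_engine_instrumentation=$2" \
        "--@rules_fuzzing//fuzzing:cc_engine_sanitizer=$3"
}

# Hypothetical usage. The engine label and the setting values are assumptions:
#   bazel build -c opt \
#       $(fuzz_flags @rules_fuzzing//fuzzing/engines:libfuzzer libfuzzer asan) \
#       //examples:re2_fuzz_test
```

In day-to-day use, a `--config` entry in `.bazelrc` that bundles the same three flags is shorter and less error-prone than passing them by hand.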
TODO: Fill in the missing documentation here.
A fuzzing engine launcher script receives configuration through the following environment variables:
Variable | Description |
---|---|
FUZZER_BINARY | The path to the fuzz target executable. |
FUZZER_TIMEOUT_SECS | If set, a positive integer representing the timeout in seconds for the entire fuzzer run. |
FUZZER_IS_REGRESSION | Set to 1 if the fuzzer should run in regression mode (just execute the input tests), or 0 if this is a continuous fuzzer run. |
FUZZER_DICTIONARY_PATH | If set, provides a path to a fuzzing dictionary file. |
FUZZER_SEED_CORPUS_DIR | If set, provides a directory path to a seed corpus. |
FUZZER_OUTPUT_ROOT | A writable path that can be used by the fuzzer during its execution (e.g., as a workspace or for generated artifacts). See the variables below for specific categories of output. |
FUZZER_OUTPUT_CORPUS_DIR | A path under FUZZER_OUTPUT_ROOT where the new generated tests should be stored. |
FUZZER_ARTIFACTS_DIR | A path under FUZZER_OUTPUT_ROOT where generated crashes and other relevant artifacts should be stored. |
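
As a rough illustration of how a launcher might consume these variables, here is a minimal POSIX shell sketch for a libFuzzer-style engine. The variable names are the ones from the table above. The mapping onto libFuzzer's `-artifact_prefix`, `-max_total_time`, `-dict`, and `-runs` flags, the `run_fuzzer` function name, and the `LAUNCHER_DRY_RUN` switch are assumptions made for this example; this is not the launcher shipped with the rules.

```shell
#!/bin/sh
# Sketch of a libFuzzer-style launcher driven by the FUZZER_* variables.
# run_fuzzer and LAUNCHER_DRY_RUN are hypothetical names used for illustration.
run_fuzzer() {
    # Start the command line with the fuzz target executable (required).
    set -- "${FUZZER_BINARY:?FUZZER_BINARY must be set}"

    # Store crashes and other artifacts in the designated directory.
    set -- "$@" "-artifact_prefix=${FUZZER_ARTIFACTS_DIR:?FUZZER_ARTIFACTS_DIR must be set}/"

    # Optional overall time limit for the run.
    if [ -n "${FUZZER_TIMEOUT_SECS:-}" ]; then
        set -- "$@" "-max_total_time=${FUZZER_TIMEOUT_SECS}"
    fi

    # Optional fuzzing dictionary.
    if [ -n "${FUZZER_DICTIONARY_PATH:-}" ]; then
        set -- "$@" "-dict=${FUZZER_DICTIONARY_PATH}"
    fi

    # Regression mode: only execute the existing inputs, generate nothing new.
    if [ "${FUZZER_IS_REGRESSION:-0}" = "1" ]; then
        set -- "$@" "-runs=0"
    fi

    # Corpus directories: the first one receives newly generated tests.
    set -- "$@" "${FUZZER_OUTPUT_CORPUS_DIR:?FUZZER_OUTPUT_CORPUS_DIR must be set}"
    if [ -n "${FUZZER_SEED_CORPUS_DIR:-}" ]; then
        set -- "$@" "${FUZZER_SEED_CORPUS_DIR}"
    fi

    if [ "${LAUNCHER_DRY_RUN:-0}" = "1" ]; then
        # Print one argument per line instead of running the fuzzer.
        printf '%s\n' "$@"
    else
        exec "$@"
    fi
}
```

A real launcher would end with a call to `run_fuzzer`. Setting the hypothetical `LAUNCHER_DRY_RUN=1` makes it easy to inspect the assembled command line without starting a fuzzing session.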