| # FuzzTest Domain Reference |
| |
| [TOC] |
| |
| This document describes the available input domains, and how you can create your |
| own domains, within FuzzTest. The first section lists the set of existing |
| primary domains. A second section lists "combinators" which allow the mixing of |
| two or more primary domains. |
| |
| Note: Note that all APIs described below are in the `fuzztest::` namespace. |
| |
| ## Built-In Domains |
| |
| The following domains are built into FuzzTest as primary domains which you can |
| use out of the box. |
| |
| ### Arbitrary Domains |
| |
| The `Arbitrary<T>()` domain is implemented for all native C++ types and for |
| protocol buffers. Specifically, for the following types: |
| |
| - Boolean type: `bool`. |
| - Character types: `char`, `signed char`, `unsigned char`. |
| - Integral types: `short`, `int`, `unsigned`, `int8_t`, `uint32_t`, `long |
| long`, etc. |
| - Floating types: `float`, `double`, etc. |
| - Enumeration types: `enum`, `enum class` (TBD: b/183016365). |
| - Simple user defined structs. |
| - Tuple types: `std::pair<T1,T2>`, `std::tuple<T,...>`. |
| - Smart pointers: `std::unique_ptr<T>`, `std::shared_ptr<T>`. |
| - Optional types: `std::optional<T>`. |
| - Variant types: `std::variant<T,...>`. |
| - String types: `std::string`, etc. |
| - String view type: `std::string_view`. |
| - Sequence container types: `std::vector<T>`, `std::array<T>`, |
| `std::deque<T>`, `std::list<T>`, etc. |
| - Unordered associative container types: `std::unordered_set`, |
| `absl::flat_hash_set`, `absl::node_hash_set`, `std::unordered_map`, |
| `absl::flat_hash_map`, `absl::node_hash_map`, etc. |
| - Ordered associative container types: `std::set<K>`, `std::map<K,T>`, |
| `std::multiset<K>`, `std::multimap<K,T>`, etc. |
| - Protocol buffer types: `MyProtoMessage`, etc. |
| - [Abseil time library types](https://abseil.io/docs/cpp/guides/time): |
| `absl::Duration`, `absl::Time`. |
| |
| Composite or container types, like `std::optional<T>` or `std::vector<T>`, are |
| supported as long as the inner types are. For example, |
| `Arbitrary<std::vector<T1>>()` is implemented, if `Arbitrary<T1>()` is |
| implemented. The inner elements will be created and mutated via the |
| `Arbitrary<T1>` domain. For example, the `Arbitrary<std::tuple<int, |
| std::string>>()` or the `Arbitrary<std::variant<int, std::string>>()` domain |
| will use `Arbitrary<int>()` and `Arbitrary<std::string>()` as sub-domains. |
| |
| User defined structs must support |
| [aggregate initialization](https://en.cppreference.com/w/cpp/language/aggregate_initialization), |
| must have only public members and no more than 64 fields. |
| |
| Recall that `Arbitrary` is the default input domain, which means that you can |
| fuzz a function like below without a `.WithDomains()` clause: |
| |
| ```c++ |
| void MyProperty(const absl::flat_hash_map<uint32, MyProtoMessage>& m, |
| const std::optional<std::string>& s) { |
| //... |
| } |
| FUZZ_TEST(MySuite, MyProperty); |
| ``` |
| |
| Under the hood, FuzzTest implements each domain as a custom object mutator. |
| These mutators, combined with the underlying coverage-guided fuzzing algorithm, |
| iteratively find values that increase the coverage of the code under test. |
| Beyond that, it also tries "special" values of the given domain. E.g., for |
| arbitrary integer domains, it will try values like `0`, `1`, and |
| `std::numeric_limits<T>::max()`. For floating point domains will try values like |
| `0`, `-0`, `NaN`, and `std::numeric_limits<T>::infinity()`. For container |
| domains it will try empty, small and large containers, and so on. |
| |
| ### Numerical Domains |
| |
| Other than `Arbitrary<int>()`, `Arbitrary<float>()`, etc., we have the following |
| more "restricted" numerical domains: |
| |
| - `InRange(min, max)` represents any value between [`min`, `max`], closed |
| interval. E.g., an arbitrary probability value could be represented with |
| `InRange(0.0, 1.0)`. |
| - `NonZero<T>()` is like `Arbitrary<T>()` without the zero value. |
| - `Positive<T>()` represents numbers greater than zero. |
| - `NonNegative<T>()` represents zero and numbers greater than zero. |
| - `Negative<T>()` represents numbers less than zero. |
| - `NonPositive<T>` represents zero and numbers less than zero. |
| - `Finite<T>` represent floating points numbers that are neither infinity nor |
| NaN. |
| |
| For instance, if your test function has a precondition that the input has to be |
| positive, you can write your FUZZ_TEST like this: |
| |
| ```c++ |
| void MyProperty(int x) { |
| ASSERT(x > 0); |
| // ... |
| } |
| FUZZ_TEST(MySuite, MyProperty).WithDomains(Positive<int>()); |
| ``` |
| |
| ### Character Domains |
| |
| Other than `Arbitrary<char>()`, we have the following more specific character |
| domains: |
| |
| - `InRange(min, max)` can be applied to characters as well, e.g., |
| `InRange('a', 'z')`. |
| - `NonZeroChar()` represents any char except `'\0'`. |
| - `NumericChar()` is alias for `InRange('0', '9')`. |
| - `LowerChar()` is alias for `InRange('a', 'z')`. |
| - `UpperChar()` is alias for `InRange('A', 'Z')`. |
| - `AlphaChar()` is alias for `OneOf(LowerChar(), UpperChar())`. |
| - `AlphaNumericChar()` is alias for `OneOf(AlphaChar(), NumericChar())`. |
| - `PrintableAsciiChar()` represents any printable character |
| (`InRange<char>(32, 126)`). |
| - `AsciiChar()` represents any ASCII character (`InRange<char>(0, 127)`). |
| |
| ### String Domains |
| |
| You can use the following basic string domains: |
| |
| - `String()` is an alias for `Arbitrary<std::string>()`. |
| - `AsciiString()` represents strings of ASCII characters. |
| - `PrintableAsciiString()` represents printable strings. |
| |
| You also define your string domains with custom character domains using the |
| [StringOf()](#string-combinator) domain combinator. |
| |
| ### `InRegexp` Domains |
| |
| You can also use regular expressions to define a string domain. The `InRegexp` |
| domain represents strings that are sentences of a given regular expression. You |
| can use any regular expression syntax |
| [accepted by RE2](https://github.com/google/re2/wiki/Syntax). For example: |
| |
| ```c++ |
| auto DateLikeString() { |
| return InRegexp("[0-9]{4}-[0-9]{2}-[0-9]{2}"); |
| } |
| |
| auto EmailLikeString() { |
| return InRegexp("[a-zA-Z0-9]+@[a-zA-Z0-9]+\\.[a-z]{2,6}*"); |
| } |
| ``` |
| |
| This is useful for testing APIs that required specially formatted strings like |
| email addresses, phone numbers, URLs, etc. Here is an example test for a date |
| parser: |
| |
| ```c++ |
| // Tests with values matching the regexp below (like `08/29/5434`) that the |
| // parser always return true. Note that the regexp doesn't handle leap years. |
| static void ParseFirstDateInStringAlwaysSucceedsForDates( |
| const std::string& date_str) { |
| StringDateParser date_parser(false); |
| SimpleDate output; |
| EXPECT_TRUE(date_parser.ParseFirstDateInString( |
| date_str, i18n_identifiers::language_code::ENGLISH_US(), &output)); |
| } |
| FUZZ_TEST(EnglishLocaleTest, ParseFirstDateInStringAlwaysSucceedsForDates) |
| .WithDomains(fuzztest::InRegexp( |
| "^(0[1-9]|1[012])/(0[1-9]|[12][0-9])/[1-9][0-9]{3}$")); |
| ``` |
| |
| ### `ElementOf` Domains |
| |
| We can also define a domain by explicitly enumerating the set of values in it. |
| You can do this with the `ElementOf` domain, that can be instantiated with a |
| vector of constant values of some type, e.g.: |
| |
| ```c++ |
| auto AnyLittlePig() { |
| return ElementOf<std::string>({"Fifer Pig", "Fiddler Pig", "Practical Pig"}); |
| } |
| |
| auto MagicNumber() { |
| return ElementOf({0xDEADBEEF, 0xBADDCAFE, 0xFEEDFACE}); |
| } |
| ``` |
| |
| The type can be anything, `enum`s too: |
| |
| ```c++ |
| enum Status { kYes, kNo, kMaybe }; |
| |
| auto AnyStatus() { |
| return ElementOf<Status>({kYes, kNo, kMaybe}); |
| } |
| ``` |
| |
| The `ElementOf` domain is often used in combination with other domains, for |
| instance to provide some concrete examples while fuzzing with arbitrary inputs, |
| e.g.: `OneOf(MagicNumber(), Arbitrary<uint32>())`. TODO reference combinations |
| |
| Or it can also be used for a |
| [regular value-parameterized unit tests](https://google.github.io/googletest/advanced.html#value-parameterized-tests): |
| |
| ```c++ |
| void WorksWithAnyPig(const std::string& pig) { |
| EXPECT_TRUE(IsLittle(pig)); |
| } |
| FUZZ_TEST(IsLittlePigTest, WorksWithAnyPig).WithDomains(AnyLittlePig()); |
| ``` |
| |
| ### `BitFlagCombinationOf` Domains |
| |
| The `BitFlagCombinationOf` domain takes a list of binary flags and yields a |
| random combination of them made through bitwise operations (`&`, `^`, etc.). |
| Consider we have the following bitflag values: |
| |
| ```c++ |
| enum Options { |
| kFirst = 1 << 0, |
| kSecond = 1 << 1, |
| kThird = 1 << 2, |
| }; |
| ``` |
| |
| The domain `BitFlagCombinationOf({kFirst, kThird})` will include `{0, kFirst, |
| kThird, kFirst | kThird}`. |
| |
| ### Protocol Buffer Domains |
| |
| You can use the `Arbitrary<T>()` domain with any proto message type or bare |
| proto enum, e.g.: |
| |
| ```c++ |
| void DoingStuffDoesNotCrashWithAnyProto(const ProtoA& msg_a, const ProtoB msg_b) { |
| DoStuff(msg_a, msg_b); |
| } |
| FUZZ_TEST(MySuite, DoingStuffDoesNotCrashWithAnyProto); |
| |
| void DoingStuffDoesNotCrashWithEnumValue(Proto::Enum e) { |
| switch(e) { |
| case Proto::ENUM_ABC: |
| // etc... |
| } |
| } |
| FUZZ_TEST(MySuite, DoingStuffDoesNotCrashWithEnumValue); |
| ``` |
| |
| By default, all fields will use `Arbitrary<U>()` for their values. The |
| exceptions are: |
| |
| * `string` fields which will guarantee UTF8 values. |
| * `enum` fields will select only valid labels. |
| |
| Alternatively, you can use `ProtobufOf` to define a domain for |
| `unique_ptr<Message>` using a protobuf prototype (the default protobuf message). |
| Note that `ProtobufOf` doesn't take `const Message*` (the prototype) directly. |
| It takes a *function pointer* that returns a `const Message*`. This delays |
| getting the prototype until the first use: |
| |
| ```c++ |
| const Message* GetMessagePrototype() { |
| const std::string name = GetPrototypeNameFromFlags(); |
| const Descriptor* descriptor = |
| DescriptorPool::generated_pool()->FindMessageTypeByName(name); |
| return MessageFactory::generated_factory()->GetPrototype(descriptor); |
| } |
| |
| void DoStuffDoesNotCrashWithMyProto(const std::unique_ptr<Message>& my_proto){ |
| DoStuff(my_proto); |
| } |
| FUZZ_TEST(MySuite, DoStuffDoesNotCrashWithMyProto) |
| .WithDomains(ProtobufOf(GetMessagePrototype)); |
| ``` |
| |
| #### Customizing Individual Fields |
| |
| Each proto field has a type (i.e., int32/string) and a rule |
| (optional/repeated/required). You can customize the subdomains for type, rule, |
| or both. |
| |
| **Customizing the field type:** You can customize the subdomains for type used |
| on individual fields by calling `With<Type>Field` method. Consider the following |
| proto: |
| |
| ```proto |
| message Address { |
| string number = 0; |
| string street = 1; |
| string unit = 2; |
| string city = 3; |
| string state = 4; |
| int zipcode = 5; |
| } |
| |
| message Person { |
| optional string ldap = 2; |
| enum Gender { |
| UNKNOWN = 0 |
| FEMALE = 1, |
| MALE = 2, |
| OTHER = 3, |
| } |
| optional Gender gender = 3; |
| optional Address address = 4; |
| } |
| ``` |
| |
| You can customize the domain for each field like the following: |
| |
| ```c++ |
| FUZZ_TEST(MySuite, DoingStuffDoesNotCrashWithCustomProto). |
| WithDomains(Arbitrary<Person>() |
| .WithStringField("ldap", StringOf(AlphaChar())) |
| .WithEnumField("gender", ElementOf<int>({FEMALE, MALE, OTHER})) |
| .WithProtobufField("address", |
| Arbitrary<Address>() |
| .WithInt32Field("zipcode", InRange(10000, 99999)) |
| .WithStringField("state", String().WithSize(2)))); |
| ``` |
| |
| The inner domain is as follows: |
| |
| * For `int32`, `int64`, `uint32`, `uint64`, `bool`, `float`, `double`, and |
| `string` fields the inner domain can be any `Domain<T>` of C++ type |
| `int32_t`, `int64_t`, `uint32_t`, `uint64_t`, `bool`, `float`, `double`, and |
| `std::string` respectively. |
| * For `enum` fields the inner domain is a `Domain<int>`. Note that values that |
| are not valid enums would be stored in the unknown fields set if the field |
| is a closed enum. Open enums would accept any value. The default domain for |
| enum fields only chooses between valid labels. |
| * For `message` fields the inner domain is a |
| `Domain<std::unique_ptr<Message>>`. The domain returned by |
| `Arbitrary<MyProto>()` qualifies. Note that even though it uses |
| `unique_ptr`, a null value is not allowed and will trigger undefined |
| behavior or a runtime assertion of some kind. |
| |
| The field domains are indexed by field name and will be verified at startup. A |
| mismatch between the field names and the inner domains will cause a runtime |
| failure. |
| |
| IMPORTANT: Note that *optional* fields are not always set by the fuzzer. |
| |
| **Customizing the field rule:** You can customize the field rule for optional or |
| repeated fields: |
| |
| * `WithFieldUnset` will keep the field empty. |
| * `WithFieldAlwaysSet` will keep the field non-empty. |
| |
| ```c++ |
| FUZZ_TEST(MySuite, DoingStuffDoesNotCrashWithCustomProto). |
| WithDomains(Arbitrary<MyProto>() |
| .WithFieldUnset("optional_no_val") |
| .WithFieldUnset("repeated_empty") |
| .WithFieldAlwaysSet("optional_has_val") |
| .WithFieldAlwaysSet("repeated_field_non_empty")); |
| ``` |
| |
| In addition, for repeated fields, you can customize its size with |
| `WithRepeatedField[Min|Max]Size`. |
| |
| ```c++ |
| FUZZ_TEST(MySuite, DoingStuffDoesNotCrashWithCustomProto). |
| WithDomains(Arbitrary<MyProto>() |
| .WithRepeatedFieldSize("size1", 1) |
| .WithRepeatedFieldMinSize("size_ge_2", 2) |
| .WithRepeatedFieldMaxSize("size_le_3", 3)); |
| ``` |
| |
| **Customizing the field rule and type:** You can also customize both the rule |
| and type at the same time with `With[Optional|Repeated]<Type>Field`: |
| |
| ```c++ |
| FUZZ_TEST(MySuite, DoingStuffDoesNotCrashWithCustomProto). |
| WithDomains(Arbitrary<MyProto>() |
| // Could be set or unset. |
| .WithOptionalInt64Field("int", OptionalOf(InRange(1l, 10l))) |
| // Is always unset. |
| .WithOptionalUInt32Field("uint32", NullOpt<int32_t>()) |
| // Is always set. |
| .WithOptionalInt32Field("int32", NonNull(InRange(1, 10))) |
| // Is non-empty. |
| .WithRepeatedInt32Field("rep_int", VectorOf(InRange(1, 10)).WithMinSize(1)) |
| // Is vector of unique elements. |
| .WithRepeatedInt64Field("rep_int64", UniqueElementsVectorOf(InRange(1l, 10l)))); |
| ``` |
| |
| For *optional* fields the domain is of the form `Domain<std::optional<Type>>` |
| and for *repeated* fields the domain is of the form `Domain<std::vector<Type>>`. |
| |
| Also, `With<Type>FieldAlwaysSet` is a shorter alternative when one wants to |
| customize non-empty fields: |
| |
| ```c++ |
| // Short form for .WithOptionalIntField("optional_int", NonNull(InRange(1, 10))) |
| .WithIntFieldAlwaysSet("optional_int", InRange(1, 10)) |
| |
| // Short form for .WithRepeatedIntField("repeated_int", VectorOf(InRange(1, 10)).WithMinSize(1)) |
| .WithIntFieldAlwaysSet("repeated_int", InRange(1, 10)) |
| ``` |
| |
| #### Customizing a Subset of Fields |
| |
| You can customize the domain for a subset of fields, for example all fields with |
| message type `Date`, or all fields with "amount" in the field's name. |
| |
| IMPORTANT: Note that customization options can conflict each other. In case of |
| conflicts the latter customization always overrides the former. Moreover, |
| individual field customization discussed in the previous section cannot precede |
| customizations in this section. |
| |
| **Customizing Multiple Fields With Same Type:** You can set the domain for a |
| subset of fields with the same type using `With<Type>Fields`. By default this |
| applies to all fields of Type. You can also provide a filter function to select |
| a subset of fields. Consider the `Moving` proto: |
| |
| ```proto |
| message Address{ |
| optional string line1 = 1; |
| optional string line2 = 2; |
| optional string city = 3; |
| optional State state = 4; |
| optional int32 zipcode = 5; |
| } |
| |
| message Moving{ |
| optional Address from_address = 1; |
| optional Address to_address = 2; |
| optional google.protobuf.Timestamp start_ts = 3; |
| optional google.protobuf.Timestamp deadline_ts = 4; |
| optional google.protobuf.Timestamp finish_ts = 5; |
| optional int32 customer_id = 6; |
| optional int32 distance = 7; |
| optional int32 cost_estimate = 8; |
| optional int32 balance = 9; |
| } |
| ``` |
| |
| Most integer fields should be positive and there are multiple |
| `Timestamp`/`zipcode` fields which require special domains: |
| |
| ```c++ |
| |
| bool IsZipCode(const FieldDescriptor* field) { |
| return field->name() == "zipcode"; |
| } |
| bool IsTimestamp(const FieldDescriptor* field){ |
| return field->message_type()->full_name() == "google.protobuf.Timestamp"; |
| } |
| FUZZ_TEST(MySuite, DoingStuffDoesNotCrashWithCustomProto) |
| .WithDomains(Arbitrary<Moving>() |
| // All int fields should be positive |
| .WithInt32Fields(Positive<int>()) |
| // except balance field which can be negative |
| .WithInt32Field("balance", Arbitrary<int>()) |
| // and except all zipcode fields which should have 5 digits |
| .WithInt32Fields(IsZipcode, InRange(10000, 99999)) |
| // All Timestamp fields should have "nanos" field unset. |
| .WithProtobufFields(IsTimestamp, Arbitrary<Timestamp>().WithFieldUnset("nanos"))); |
| ``` |
| |
| Notice that these filters apply recursively to nested protos as well. You can |
| restrict the filter to optional or repeated fields by using |
| `WithOptional<Type>Fields` or `WithRepeated<Type>Fields`, respectively. |
| |
| **Customizing Rules of Multiple Fields:** You can customize the nullness for a |
| subset of fields using `WithFieldsAlwaysSet`, `WithFieldsUnset`, and filters: |
| |
| ```c++ |
| bool IsProtoType(const FieldDescriptor* field){ |
| return field->cpp_type() == FieldDescriptor::CPPTYPE_MESSAGE; |
| } |
| FUZZ_TEST(MySuite, DoingStuffDoesNotCrashWithCustomProto). |
| WithDomains(Arbitrary<MyProto>() |
| // Always set optional fields, and repeated fields have size > 0, |
| .WithFieldsAlwaysSet() |
| // except fields that contain nested protos, |
| .WithFieldsUnset(IsProtoType) |
| // and except "foo" field. We override the nullness by using the |
| // WithOptionalInt32Filed, which will allow the fuzzer to set and unset |
| // this field. |
| .WithOptionalInt32Field("foo", OptionalOf(Arbitrary<int>())) |
| ); |
| ``` |
| |
| You can restrict the filter to optional or repeated fields by using |
| `WithOptionalFields[Unset|AlwaysSet]` or `WithRepeatedFields[Unset|AlwaysSet]`, |
| respectively. |
| |
| Furthermore, you can customize repeated fields size using |
| `WithRepeatedFields[Min|Max]?Size`: |
| |
| ```c++ |
| bool IsChildrenOfBinaryTree(const FieldDescriptor* field){ |
| return field->containing_type()->full_name() == "my_package.Node" |
| && field->name() == "children"; |
| } |
| FUZZ_TEST(MySuite, DoingStuffDoesNotCrashWithCustomProto). |
| WithDomains(Arbitrary<MyProto>() |
| // Repeated fields should have size in range 1-10 |
| .WithRepeatedFieldsMinSize(1) |
| .WithRepeatedFieldsMaxSize(10) |
| // except children of the inner nodes of a binary tree (has exactly two children). |
| .WithRepeatedFieldsSize(IsChildrenOfBinaryTree, 2) |
| // and except "additional_info" field which can be empty or arbitrary large |
| .WithInt32Field("additional_info", VectorOf(String())) |
| ); |
| ``` |
| |
| Notice that `With[Optional|Repeated]Fields[Unset|AlwaysSet]` and |
| `WithRepeatedFields[Min|Max]?Size` work recursively and apply to subprotos as |
| well, but calling `With[Optional|Repeated]FieldsAlwaysSet` or |
| `WithRepeatedFields[Min]?Size(X)` with `X > 0` on recursively defined protos |
| causes a failure. |
| |
| #### Customizing Oneof Fields |
| |
| You can customize oneof fields similarly as other optional fields. However, the |
| oneof could be unset even if you always set one of its field. Consider the |
| following example: |
| |
| ```proto |
| message Algorithm{ |
| oneof strategy { |
| string strategy_id = 1; |
| int64 legacy_strategy_id = 2; |
| } |
| } |
| ``` |
| |
| Then, you wish to customize the domain and test different non-legacy strategies. |
| The following would not work because the `legacy_strategy_id` could still have |
| values. |
| |
| ```c++ |
| FUZZ_TEST(MySuite, DoingStuffDoesNotCrashWithCustomProto). |
| WithDomains(Arbitrary<Algorithm>() |
| .WithFieldAlwaysSet("strategy_id")); |
| ``` |
| |
| For oneof to always have a value, at least one of its field should be marked as |
| `AlwaysSet` and the rest should be marked `Unset` (for example, when you use |
| `WithFieldsAlwaysSet`). Otherwise, you need to explicitly set it by |
| `WithOneofAlwaysSet`: |
| |
| ```c++ |
| FUZZ_TEST(MySuite, DoingStuffDoesNotCrashWithCustomProto). |
| WithDomains(Arbitrary<Algorithm>() |
| .WithOneofAlwaysSet("strategy") |
| .WithFieldUnset("legacy_strategy_id")); |
| ``` |
| |
| ## What Domains Should You Use for View Types? |
| |
| If your property function takes "view types", such as `std::string_view` or |
| `std::span<T>`, you have multiple options. |
| |
| For a `std::string_view` parameter you can use `std::string` domains, such as |
| `Arbitrary<std::string>()` or `InRegexp("[ab]+")`. The `string`-s created by the |
| domain get implicitly converted to `string_view`-s. Alternatively, you can use |
| `Arbitrary<std::string_view>()` which creates `string_view`-s in the first |
| place, automatically backed by `string` values. This means that in regular |
| value-parameterized unit tests, |
| `.WithDomains()` can be omitted: |
| |
| ```c++ |
| void UnescapeNeverCrashes(std::string_view s) { Unescape(s); } |
| FUZZ_TEST(UnescapeTest, UnescapeNeverCrashes); |
| ``` |
| |
| If you have a `std::span` parameter, you can use a `std::vector` domain, for |
| example: |
| |
| ```c++ |
| void MyProperty(std::span<int> ints) { ... } |
| FUZZ_TEST(MySuite, MyProperty).WithDomains(Arbitrary<std::vector<int>>()); |
| ``` |
| |
| TODO(b/200074418): More native support for view types. |
| |
| ## Domain Combinators |
| |
| Domain combinators let you create more complex domains from simpler ones. |
| |
| ### String Combinator |
| |
| The `StringOf(character_domain)` domain combinator function lets you specify the |
| domain of the characters in `std::string`. For instance, to represent strings |
| that are composed only of specific characters, you can use |
| |
| ```c++ |
| StringOf(OneOf(InRange('a', 'z'), ElementOf({'.', '!', '?'}))) |
| ``` |
| |
| (See [OneOf](#oneof-combinator) combinator and [ElementOf](#elementof-domain) |
| domain.) |
| |
| Another example is the `AsciiString()`, whose implementation is |
| `StringOf(AsciiChar())`. |
| |
| ### Container Combinators |
| |
| You can specify the domain of the *elements* in a container using container |
| combinators. `ContainerOf<T>(elements_domain)` is the generic container |
| combinator, which you can use like this: |
| |
| ```c++ |
| auto VectorOfNumbersBetweenOneAndSix() { |
| auto one_to_six = InRange(1, 6); |
| return ContainerOf<std::vector<int>>(one_to_six); |
| } |
| ``` |
| |
| This domain represents any vector whose elements are numbers between 1 and 6. |
| |
| The previous example can be simplified by dropping `<int>` after `std:vector`, |
| as this can be inferred automatically: |
| |
| ```c++ |
| auto VectorOfNumbersBetweenOneAndSix() { |
| return ContainerOf<std::vector>(InRange(1, 6)); |
| } |
| ``` |
| |
| In particular, if the container type `T` is a class template (e.g. |
| `std::vector`) whose first template parameter is the type of the values stored |
| in the container, and whose other template parameters, if any, are optional, |
| then all the template parameters of `T` may be omitted, in which case |
| `ContainerOf` will use the `value_type` of the `elements_domain` as the first |
| template parameter for `T`. |
| |
| `ContainerOf` is rarely used directly however, as there are more ergonomic |
| shorthands available shown below. |
| |
| #### Shorthands |
| |
| We provide shorthand aliases for the most common container combinator types. |
| E.g., the above example can be written simply as |
| |
| ```c++ |
| VectorOf(InRange(1, 6)) |
| ``` |
| |
| The following shorthand aliases are available: |
| |
| - `VectorOf(inner)` is alias for `ContainerOf<std::vector<T>>(inner)`. |
| |
| - `DequeOf(inner)` is alias for `ContainerOf<std::deque<T>>(inner)`. |
| |
| - `ListOf(inner)` is alias for `ContainerOf<std::list<T>>(inner)`. |
| |
| - `SetOf(inner)` is alias for `ContainerOf<std::set<T>>(inner)`. |
| |
| - `MapOf(key_domain, value_domain)` is alias for |
| `ContainerOf<std::map<K,T>>(PairOf(key_domain, value_domain))`. |
| |
| - `UnorderedSetOf(inner)` is alias for |
| `ContainerOf<std::unordered_set<T>>(inner)`. |
| |
| - `UnorderedMapOf(key_domain, value_domain)` is alias for |
| `ContainerOf<std::unordered_map<K,T>>(PairOf(key_domain, value_domain))`. |
| |
| - `ArrayOf(inner1, ..., innerN)` creates a domain for `std::array<T, N>`, |
| where `N` is the number of inner domains, and where `T` is the value type of |
| every one of the inner domains (i.e. they're all the same). |
| |
| - `ArrayOf<N>(inner)` is alias for `ArrayOf(inner, ..., inner)`, where `N` |
| copies of `inner` are passed to `ArrayOf`. |
| |
| ### Custom Container Size |
| |
| The size of any container domain can be customized using the `WithSize()`, |
| `WithMinSize()` and `WithMaxSize()` setters. |
| |
| For instance, to represent arbitrary integer vectors of size 42, we can use: |
| |
| ```c++ |
| Arbitrary<std::vector<int>>().WithSize(42) |
| ``` |
| |
| This works with container combinators as well, e.g.: |
| |
| ```c++ |
| ContainerOf<std::vector<int>>(InRange(0,10)).WithMinSize(2).WithMaxSize(3) |
| ``` |
| |
| or |
| |
| ```c++ |
| VectorOf(InRange(0,10)).WithMinSize(2).WithMaxSize(3) |
| ``` |
| |
| #### NonEmpty Containers |
| |
| To represent any non-empty container you can use `NonEmpty()`, e.g., |
| |
| ```c++ |
| NonEmpty(Arbitrary<std::vector<int>>()) |
| ``` |
| |
| or |
| |
| ```c++ |
| NonEmpty(VectorOf(String())) |
| ``` |
| |
| The `NonEmpty(domain)` is shorthand for `domain.WithMinSize(1)`. |
| |
| ### Unique Elements Containers |
| |
| Sometimes we need a vector with all unique elements. We can use the |
| `UniqueElementsContainerOf<T>()` combinator to get one. |
| |
| ```c++ |
| UniqueElementsContainerOf<std::vector<int>>(Arbitrary<int>()) |
| ``` |
| |
| or using the shorthand: |
| |
| ```c++ |
| UniqueElementsVectorOf(Arbitrary<int>()) |
| ``` |
| |
| ### Aggregate Combinators |
| |
| Just like with containers, we often need to specify the inner domains of |
| [aggregate data types](https://en.cppreference.com/w/cpp/language/aggregate_initialization). |
| We can do this with various aggregate combinator functions listed in this |
| section. |
| |
| #### StructOf |
| |
| The StructOf combinator function lets you define the domain of each field of a |
| user-defined struct. |
| |
| ```c++ |
| struct Thing { |
| int id; |
| std::string name; |
| }; |
| |
| auto AnyThing() { |
| return StructOf<Thing>(InRange(0, 10), |
| Arbitrary<std::string>()); |
| } |
| ``` |
| |
| #### ConstructorOf |
| |
| The `ConstructorOf<T>()` combinator lets you define a domain for a class T by |
| specifying the domains for T's constructor parameters. For example: |
| |
| ```c++ |
| auto AnyAbslStatus() { |
| return ConstructorOf<absl::Status>( |
| /*status_code:*/ConstructorOf<absl::StatusCode>(InRange(0, 18)), |
| /*message:*/Arbitrary<std::string>()); |
| } |
| ``` |
| |
| #### PairOf |
| |
| The `PairOf` domain represents `std::pair<T1,T2>` of the provided inner domains. |
| For example, the domain: |
| |
| ```c++ |
| PairOf(InRange(0, 10), Arbitrary<std::string>()); |
| ``` |
| |
| provides values of type `std::pair<int, std::string>`, where the first element |
| is always between 1 and 10, and the second element is an arbitrary string. |
| |
| #### TupleOf |
| |
| The `TupleOf` domain combinator works just like the above `PairOf`. For example, |
| the domain: |
| |
| ```c++ |
| auto MyTupleDomain() { |
| return TupleOf(InRange(0, 10), |
| InRange(0, 10), |
| Arbitrary<std::string>()); |
| } |
| ``` |
| |
| represents values of type `std::tuple<int, int, std::string>`, with the |
| specified sub-domains. |
| |
| #### VariantOf |
| |
| The `VariantOf` domain combinator lets you define the domain for `variant` |
| types. For instance, the example domain below represents values of type |
| `std::variant<int, double, std::string>`, with the provided sub-domains. |
| |
| ```c++ |
| auto MyVariantDomain() { |
| return VariantOf(InRange(0, 10), |
| Arbitrary<double>(), |
| Arbitrary<std::string>()); |
| } |
| |
| ``` |
| |
| By default, `VariantOf` represents `std::variant` types, but it can also be used |
| to represent other variant types: |
| |
| ```c++ |
| auto MyAbslVariantDomain() { |
| return VariantOf<absl::variant<int, double>>(InRange(0, 10), |
| Arbitrary<double>(), |
| ``` |
| |
| #### OptionalOf |
| |
| The `OptionalOf` domain combinator lets you specify the sub-domain for value |
| type `T` for an `optional<T>` type. For instance, the domain: |
| |
| ```c++ |
| OptionalOf(InRange(0, 10)); |
| ``` |
| |
| represents values of type `std::optional<int>` of integers between 0 and 10. |
| Note that this domain includes `nullopt` as well. By default, the domain will |
| represent `std::optional`, but other optional types can be used as well: |
| |
| ```c++ |
| OptionalOf<absl::optional<int>>(InRange(0, 10)) |
| ``` |
| |
| To restrict the nullness of the domain, you can use `NullOpt` and `NonNull`: |
| |
| ```c++ |
| // Generates only null values. |
| NullOpt<int>() |
| // Generates optional<int> values that always contain an int value |
| // (i.e., it's never nullopt). |
| NonNull(OptionalOf(InRange(0, 10))) |
| ``` |
| |
| #### `SmartPointerOf`, `UniquePtrOf`, `SharedPtrOf` |
| |
| The `SmartPointerOf` domain combinator lets you specify a smart pointer `T` and |
| a subdomain to create its contents. For instance, the domain: |
| |
| ```c++ |
| SmartPointerOf<std::unique_ptr<int>>(InRange(0, 10)); |
| ``` |
| |
| represents values of type `std::unique_ptr<int>` of integers between 0 and 10. |
| Note that this domain includes `nullptr` as well. Shortcuts for |
| `std::unique_ptr` and `std::shared_ptr` exist in the form: |
| |
| ```c++ |
| UniquePtrOf(int_domain) == SmartPointerOf<std::unique_ptr<int>>(int_domain) |
| SharedPtrOf(int_domain) == SmartPointerOf<std::shared_ptr<int>>(int_domain) |
| ``` |
| |
| ### OneOf Combinators |
| |
| With the `OneOf` combinator we can merge multiple domains of the same type. For |
| example: |
| |
| ```c++ |
| auto PositiveOrMinusOne() { |
| return OneOf(Just(-1), Positive<int>()); |
| } |
| ``` |
| |
| The `Just` domain combinator simply wraps a constant into a domain, which is |
| necessary in this case, as OneOf only takes domains as arguments. |
| |
| Note that the list of domains must be known at compile time; unlike `ElementOf`, |
| you can't use a vector of domains. |
| |
| ### Map-ing Domains |
| |
| Often the best way to define a domain is using a mapping function. The `Map()` |
| domain combinator takes a mapping function and an arbitrary number of domains. |
| It uses the inner domains to generate values which are mapped using the passed |
| function. For example: |
| |
| ```c++ |
| auto AnyDurationString() { |
| auto any_int = Arbitrary<int>(); |
| auto suffixes = ElementOf<std::string>("s", "m", "h"); |
| return Map( |
| [](int i, const std::string& suffix) { return std::to_string(i) + suffix; }, |
| any_int, suffixes); |
| } |
| ``` |
| |
| ### FlatMap-ing Domains |
| |
| Sometimes, it is necessary to use the output of one domain as the input for |
| another domain. This can be accomplished with the `FlatMap()` function, which |
| is like `Map()`, but it takes a function which returns a `Domain`. For |
| example: |
| |
| ```c++ |
| auto AnyVectorOfFixedLengthStrings(int size) { |
| return VectorOf(Arbitrary<std::string>().WithSize(size)); |
| } |
| auto AnyVectorOfEqualSizedStrings() { |
| return FlatMap(AnyVectorOfFixedLengthStrings, /*size=*/ InRange(0, 10)); |
| } |
| ``` |
| |
| If `AnyVectorOfFixedLengthStrings()` had been passed to `Map()`, it would have |
| generated a `Domain<Domain<std::string>>`. `FlatMap()` "flattens" this to a |
| `Domain<std::string>`. |
| |
| ### Filter-ing Domains |
| |
| The `Filter` domain takes a domain and a predicate and returns a new domain that |
| uses the predicate to filter the generated values. |
| |
| ```c++ |
| auto NonZero() { |
| return Filter([](int x) { return x != 0; }, Arbitrary<int>()); |
| } |
| ``` |
| |
| Filtering through a domain is usually more efficient over filtering in the |
| property function, thus it is preferred. |
| |
| Important: Make sure that your filtering condition is not too restrictive. |
| Filtering simply drops values provided by the inner domain that don't match the |
| condition. So filters with very low yield would lead to ineffective fuzzing. |
| Therefore too restrictive filter functions will trigger an abort in the |
| framework. |
| |
| Unless you want filter just a few specific values (e.g., the NonZero example |
| above), consider if you can defined the domain with `Map()`-ing instead. For |
| instance, instead of: |
| |
| ```c++ {.bad} |
| auto EvenNumber() { |
| return Filter([](int i) { return i % 2 == 0; }, Arbitrary<int>()); |
| } |
| ``` |
| |
| you should use: |
| |
| ```c++ {.good} |
| auto EvenNumber() { |
| return Map([](int i) { return 2 * i; }, |
| // Ensure we don't try to produce a value that causes integer |
| // overflow; what happens next would be undefined behavior. |
| InRange(std::numeric_limits<int>::min() / 2, |
| std::numeric_limits<int>::max() / 2)); |
| } |
| ``` |
| |
| This leads to more efficient fuzzing, as no values will be dropped and no cycles |
| will be wasted. |
| |
| ### Recursive Domains |
| |
| Recursive data structures need recursive domains. We can use the `DomainBuilder` |
| to build such domains. Here are some examples: |
| |
| ```c++ |
| // Example 1: Self recursion. |
| struct Tree { |
| int value; |
| std::vector<Tree> children; |
| }; |
| |
| auto ArbitraryTree(){ |
| DomainBuilder builder; |
| builder.Set<Tree>( |
| "tree", StructOf<Tree>(InRange(0, 10), ContainerOf<std::vector<Tree>>( |
| builder.Get<Tree>("tree")))); |
| return std::move(builder).Finalize<Tree>("tree"); |
| } |
| |
| // Example 2: Loop recursion. |
| struct RedTree; |
| |
| struct BlackTree { |
| int value; |
| std::vector<RedTree> children; |
| }; |
| |
| struct RedTree { |
| int value; |
| std::vector<BlackTree> children; |
| }; |
| |
| auto ArbitraryRedBlackTree(){ |
| DomainBuilder builder; |
| builder.Set<RedTree>( |
| "redtree", StructOf<RedTree>(InRange(0, 10), |
| ContainerOf<std::vector<BlackTree>>( |
| builder.Get<BlackTree>("blacktree")))); |
| builder.Set<BlackTree>( |
| "blacktree", StructOf<BlackTree>(InRange(0, 10), |
| ContainerOf<std::vector<RedTree>>( |
| builder.Get<RedTree>("redtree")))); |
| |
| return std::move(builder).Finalize<RedTree>("redtree"); |
| } |
| ``` |
| |
| The builder maintains a set of sub-domains that comprise the domain. Every |
| domain in the builder is referenced by a name. The builder provides three |
| methods: `Get`, `Set`, and `Finalize`. `Get` returns a domain of the specified |
| type even if it hasn't been created. `Set` sets the final domain type of the |
| domain. |
| |
| When you have finished, call `Finalize` to get the domain ready for use. After |
| calling `Finalize`, the builder will be invalidated. |