pub trait Arbitrary<'a>: Sized {
    fn arbitrary(u: &mut Unstructured<'a>) -> Result<Self>;
    fn arbitrary_take_rest(u: Unstructured<'a>) -> Result<Self> { ... }
    fn size_hint(depth: usize) -> (usize, Option<usize>) { ... }
}
Generate arbitrary structured values from raw, unstructured data.
The Arbitrary trait allows you to generate valid structured values, like HashMaps, or ASTs, or MyTomlConfig, or any other data structure from raw, unstructured bytes provided by a fuzzer.
Deriving Arbitrary
Automatically deriving the Arbitrary trait is the recommended way to implement Arbitrary for your types.
Using the custom derive requires that you enable the "derive" cargo feature in your Cargo.toml:
[dependencies]
arbitrary = { version = "1", features = ["derive"] }
Then, you add the #[derive(Arbitrary)] annotation to your struct or enum type definition:
use arbitrary::Arbitrary;
use std::collections::HashSet;

#[derive(Arbitrary)]
pub struct AddressBook {
    friends: HashSet<Friend>,
}

#[derive(Arbitrary, Hash, Eq, PartialEq)]
pub enum Friend {
    Buddy { name: String },
    Pal { age: usize },
}
Every member of the struct or enum must also implement Arbitrary.
Implementing Arbitrary By Hand
Implementing Arbitrary mostly involves nested calls to the Arbitrary implementations of each of your struct or enum's members. But sometimes you need some amount of raw data, or you need to generate a variably-sized collection type, or something of that sort. The Unstructured type helps you with these tasks.
use arbitrary::{Arbitrary, Result, Unstructured};

impl<'a, T> Arbitrary<'a> for MyCollection<T>
where
    T: Arbitrary<'a>,
{
    fn arbitrary(u: &mut Unstructured<'a>) -> Result<Self> {
        // Get an iterator of arbitrary `T`s.
        let iter = u.arbitrary_iter::<T>()?;

        // And then create a collection!
        let mut my_collection = MyCollection::new();
        for elem_result in iter {
            let elem = elem_result?;
            my_collection.insert(elem);
        }

        Ok(my_collection)
    }
}
Required methods
fn arbitrary(u: &mut Unstructured<'a>) -> Result<Self>
Generate an arbitrary value of Self from the given unstructured data.
Calling Arbitrary::arbitrary requires that you have some raw data, perhaps given to you by a fuzzer like AFL or libFuzzer. You wrap this raw data in an Unstructured, and then you can call <MyType as Arbitrary>::arbitrary to construct an arbitrary instance of MyType from that unstructured data.
Implementations may return an error if there is not enough data to construct a full instance of Self. This is generally OK: it is better to exit early and get the fuzzer to provide more input data than it is to generate default values in place of the missing data, which would bias the distribution of generated values and ultimately make fuzzing less efficient.
use arbitrary::{Arbitrary, Unstructured};

#[derive(Arbitrary)]
pub struct MyType {
    // ...
}

// Get the raw data from the fuzzer or wherever else.
let raw_data: &[u8] = get_raw_data_from_fuzzer();

// Wrap that raw data in an `Unstructured`.
let mut unstructured = Unstructured::new(raw_data);

// Generate an arbitrary instance of `MyType` and do stuff with it.
if let Ok(value) = MyType::arbitrary(&mut unstructured) {
    do_stuff(value);
}
See also the documentation for Unstructured.
Provided methods
fn arbitrary_take_rest(u: Unstructured<'a>) -> Result<Self>
Generate an arbitrary value of Self from the entirety of the given unstructured data.
This is similar to Arbitrary::arbitrary, but it assumes that it is the last consumer of the given data, and is thus able to consume all of it if needed.
See also the documentation for Unstructured.
fn size_hint(depth: usize) -> (usize, Option<usize>)
Get a size hint for how many bytes out of an Unstructured this type needs to construct itself.
This is useful for determining how many elements we should insert when creating an arbitrary collection.
The return value is similar to Iterator::size_hint: it returns a tuple where the first element is a lower bound on the number of bytes required, and the second element is an optional upper bound.
The default implementation returns (0, None), which is correct for any type, but not ultimately that useful. Using #[derive(Arbitrary)] will create a better implementation. If you are writing an Arbitrary implementation by hand, and your type can be part of a dynamically sized collection (such as Vec), you are strongly encouraged to override this default with a better implementation. The size_hint module will help with this task.
The depth Parameter
If you are certain that the type you are implementing Arbitrary for is not a recursive type, or that your implementation does not transitively call any other size_hint methods, you can ignore the depth parameter. Note that if you are implementing Arbitrary for a generic type, you cannot guarantee the lack of type recursion!
Otherwise, you need to use arbitrary::size_hint::recursion_guard to prevent potential infinite recursion when calculating size hints for potentially recursive types:
use arbitrary::{Arbitrary, Unstructured, size_hint};

// This can potentially be a recursive type if `L` or `R` contain
// something like `Box<Option<MyEither<L, R>>>`!
enum MyEither<L, R> {
    Left(L),
    Right(R),
}

impl<'a, L, R> Arbitrary<'a> for MyEither<L, R>
where
    L: Arbitrary<'a>,
    R: Arbitrary<'a>,
{
    fn arbitrary(u: &mut Unstructured<'a>) -> arbitrary::Result<Self> {
        // ...
    }

    fn size_hint(depth: usize) -> (usize, Option<usize>) {
        // Protect against potential infinite recursion with
        // `recursion_guard`.
        size_hint::recursion_guard(depth, |depth| {
            // If we aren't too deep, then `recursion_guard` calls
            // this closure, which implements the natural size hint.
            // Don't forget to use the new `depth` in all nested
            // `size_hint` calls! We recommend shadowing the
            // parameter, like what is done here, so that you can't
            // accidentally use the wrong depth.
            size_hint::or(
                <L as Arbitrary>::size_hint(depth),
                <R as Arbitrary>::size_hint(depth),
            )
        })
    }
}