https://github.com/awslabs/spindle

https://github.com/awslabs/spindle

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (8.6%) to scientific vocabulary
Last synced: 10 months ago · JSON representation

Repository

Basic Info
  • Host: GitHub
  • Owner: awslabs
  • License: apache-2.0
  • Language: Rust
  • Default Branch: main
  • Size: 73.2 KB
Statistics
  • Stars: 15
  • Watchers: 3
  • Forks: 3
  • Open Issues: 0
  • Releases: 0
Created about 1 year ago · Last pushed 11 months ago
Metadata Files
Readme Contributing License Code of conduct Security

README.md

Spindle

Crates.io docs.rs License: APACHE Downloads

Spindle is a simple and efficient expression and byte sequence generator to aid fuzz testing parsers and de-serializers. Spindle spins raw, untyped byte buffers into structured data.

Overview

Spindle's syntax lets users define the structure of generated data. This syntax compiles to Grammar, a state machine that can be arbitrarily traversed to produce structure-aware, matching expressions.

Spindle works with fuzzers such as cargo-fuzz or AFL because it is an extension of arbitrary; the traversal of the state machine is deterministically dependent on Unstructured.

Spindle is particularly useful for generating semi-correct and interesting inputs that attack edge cases of parsers and de-serializers, such as mixing familiar tokens in incorrect places or sprinkling in Unicode characters.

Spindle is developed and leveraged by AWS to fuzz test the parsers and de-serializers in their backend systems.

Examples

For more examples, see the examples folder.

```rust use spindle_lib::Grammar; use arbitrary::Unstructured;

let math: Grammar = r#" expr : u16 | paren | expr symbol expr ; paren : "(" expr symbol expr ")" ; symbol : r"-|+|*|÷" ; "#.parse().unwrap();

let mut wool = Unstructured::new(b"poiuytasdbvcxeygrey"); let yarn: String = math.expression(&mut wool, None).unwrap(); // (21359*39933))+13082-62216 `` The state machine traversal always starts at the first rule. In the example, -expris the first rule and evaluates to eitheru16,paren, or the concatenation ofexprandsymbolandexpr. -;delimits different rules. -u16is a pre-defined rule that directly evaluates tou16::arbitrary(u). -parenevaluates to the concatenation of the literal"(",expr,symbol,expr, and")". -symbolevaluates to any arbitrary string that matches the regex-|+|*|÷`.

Semi-Correct Expression

This grammar is similar to the well formed math expression grammar, but sometimes includes an extra closing parenthesis and/or an arbitrary symbol.

```rust use spindle_lib::Grammar; use arbitrary::Unstructured;

let math: Grammar = r#" expr : u16 | paren | expr symbol expr ; paren : "(" expr symbol expr ")" ")"? ; symbol : r"-|+|*|÷" | String ; "#.parse().unwrap();

let mut wool = Unstructured::new(b"poiuytasdbvcxeygrey"); let yarn: String = math.expression(&mut wool, None).unwrap(); // (44637*32200)Ѱ'x124928390-27338) ```

Usage in Fuzzer

```rust,ignore use spindlelib::Grammar; use libfuzzersys::fuzz_target; use arbitrary::{Arbitrary, Result, Unstructured}; use std::sync::LazyLock;

static GRAMMAR: LazyLock = LazyLock::new(|| { r#" expr : u16 | paren | expr symbol expr ; paren : "(" expr symbol expr ")" ")"? ; symbol : r"-|+|*|÷" | String ; "#.parse().unwrap() });

struct MathExpression(String);

impl<'a> Arbitrary<'a> for MathExpression { fn arbitrary(u: &mut Unstructured<'a>) -> Result { Ok(Self(GRAMMAR.expression(u, None)?)) } }

fuzztarget!(|expr: MathExpression| { // ... myparser(expr); }); ```

Samples

text 6705d81051237= ♣69382149-12901+8851÷50*3993043534 (8198942155÷60177552446447) (586643-96)*036074789 (8÷68){K2628 (5798)) (0868430}ݾ▼73) 0135259 (930-6*9502) 5045620÷91599

Grammar Syntax

For examples see examples.

| Syntax | Description | | ----------- | ----------- | | rule : X ; | Defines a rule with name "rule" with some pattern X. "rule" can be referenced in the same grammar, e.g. another_rule : rule+ ; | | X? | Evaluates to either X or nothing. | | X+ | Evaluates to X 1 or more times (up to and including [crate::MAX_REPEAT]) | | X* | Evaluates to X 0 or more times (up to and including [crate::MAX_REPEAT]) | | X{k} | Evaluates to X exactly k times, where k is a u32. | | X{min,max} | Evaluates X at least min times and at most (including) max times. min and max are u32. | | X \| Y | Evaluates to either X or Y. | | "X" | Literal value inside the quotes, e.g. "foo" | | [X] | Literal Vec<u8>, e.g. [1, 2]. | | r"X" | Arbitrarily evaluates the regex inside the quotes, e.g. r"[A-Z]+". | | X Y | Evaluates to X and then Y. | | (X) | Groups the expression inside the parenthesis, e.g. (X \| Y)+. | | X @ w \| Y @ v | Evaluates to either X or Y based off probabilities given by weights w and v respectively, where w and v are u16. | u16, String, etc | A pre-defined type that evaluates to T::arbitrary(u). See more. Supported pre-defined rules are String, char, f32, f64, and signed + unsigned integer types. |

Visitor

A [Visitor] is state that is initialized before traversal and mutated as different rules are visited during the traversal, e.g. visit_or. Visitors that are already implemented are String and Vec<u8> for output buffers, and u64 for classification.

Users can implement their own Visitor to - use a different output buffer - use a different classification - gather data - build an abstract syntax tree

Example

```rust use spindle_lib::{Grammar, Visitor};

let math: Grammar = r#" expr : u16 | paren | expr symbol expr ; paren : "(" expr symbol expr ")" ; symbol : r"-|+|*|÷" ; "#.parse().unwrap();

/// Detects if any prime numbers were generated.

[derive(Debug, Default)]

struct PrimeDetector(bool);

impl Visitor for PrimeDetector { fn new() -> Self { Self::default() }

fn visit_u16(&mut self, num: u16) {
    let num_is_prime = (2..num).all(|a| num % a != 0);
    self.0 |= num_is_prime;
}

}

let mut wool = arbitrary::Unstructured::new(b"qwerty"); let (expr, anyprimes): (String, PrimeDetector) = math.expression(&mut wool, None).unwrap(); let yarn: String = math.expression(&mut wool, None).unwrap(); assert!(anyprimes.0); ```

Example

In this example, a math specific abstract syntax tree (AST) is built during the arbitrary traversal.

```rust use spindle_lib::{Grammar, Visitor};

let math: Grammar = r#" expr : u16 | paren | expr symbol expr ; paren : "(" expr symbol expr ")" ; symbol : r"-|+|*|÷" ; "#.parse().unwrap();

[derive(Debug, Default)]

struct MathAst { cur_op: Option, stack: Vec, }

[derive(Debug)]

enum MathAstNode { Num(u16), Expr(Box, char, Box) }

impl Visitor for MathAst { fn new() -> Self { Self::default() }

fn visit_regex(&mut self, regex_output: &[u8]) {
    assert_eq!(regex_output.len(), 1);
    let c = regex_output[0] as char;
    self.cur_op = Some(c);
}

fn visit_u16(&mut self, num: u16) {
    match self.cur_op {
        None => self.stack.push(MathAstNode::Num(num)),
        Some(symbol) => {
            let left = Box::new(self.stack.pop().unwrap());
            let right = Box::new(MathAstNode::Num(num));
            let new = MathAstNode::Expr(left, symbol, right);
            self.stack.push(new);
            self.cur_op = None;
        }
    }
}

}

let mut wool = arbitrary::Unstructured::new(b"494392"); // MathAst { cur_op: None, stack: [Expr(Num(13108), '*', Num(0))] } let yarn: MathAst = math.expression(&mut wool, None).unwrap(); ```

Owner

  • Name: Amazon Web Services - Labs
  • Login: awslabs
  • Kind: organization
  • Location: Seattle, WA

AWS Labs

GitHub Events

Total
  • Watch event: 12
  • Delete event: 4
  • Push event: 14
  • Public event: 1
  • Pull request review comment event: 17
  • Pull request review event: 21
  • Pull request event: 11
  • Fork event: 1
  • Create event: 5
Last Year
  • Watch event: 12
  • Delete event: 4
  • Push event: 14
  • Public event: 1
  • Pull request review comment event: 17
  • Pull request review event: 21
  • Pull request event: 11
  • Fork event: 1
  • Create event: 5

Packages

  • Total packages: 1
  • Total downloads:
    • cargo 11,270 total
  • Total dependent packages: 0
  • Total dependent repositories: 0
  • Total versions: 3
  • Total maintainers: 1
crates.io: spindle-lib

Simple and efficient expression and byte sequence generator for fuzz testing.

  • Versions: 3
  • Dependent Packages: 0
  • Dependent Repositories: 0
  • Downloads: 11,270 Total
Rankings
Dependent repos count: 21.9%
Dependent packages count: 29.0%
Average: 48.6%
Downloads: 94.8%
Maintainers (1)
Last synced: 10 months ago

Dependencies

.github/workflows/rust.yml actions
  • actions/checkout v4 composite
Cargo.toml cargo
  • rand 0.9.0 development
  • regex 1.11 development
  • arbitrary 1.4.1
  • enum-iterator 2.1.0
  • fxhash 0.2.1
  • itoa 1.0.11
  • peg 0.8.5
  • regex-syntax 0.8.5