oss-fuzz-gen

LLM powered fuzzing via OSS-Fuzz.

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

✓ CITATION.cff file: Found CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
○ DOI references
○ Academic publication links
✓ Committers with academic emails: 1 of 27 committers (3.7%) from academic institutions
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (15.8%) to scientific vocabulary

Keywords

ai fuzzing llm security

Keywords from Contributors

transformers cryptocurrencies sequences interactive network-simulation testing-tools hacking observability multi-agents application

Last synced: 6 months ago

Repository

LLM powered fuzzing via OSS-Fuzz.

Basic Info

Host: GitHub
Owner: google
License: apache-2.0
Language: Python
Default Branch: main
Homepage:
Size: 55.5 MB

Statistics

Stars: 1,274
Watchers: 17
Forks: 196
Open Issues: 152
Releases: 0

Topics

ai fuzzing llm security

Created about 2 years ago · Last pushed 6 months ago

Metadata Files

Readme Contributing License Citation

A Framework for Fuzz Target Generation and Evaluation

This framework generates fuzz targets for real-world C/C++/Java/Python projects with various Large Language Models (LLMs) and benchmarks them via the OSS-Fuzz platform.

More details are available in the blog post "AI-Powered Fuzzing: Breaking the Bug Hunting Barrier" (https://security.googleblog.com/2023/08/ai-powered-fuzzing-breaking-bug-hunting.html).

Currently supported models are:

- Vertex AI code-bison
- Vertex AI code-bison-32k
- Gemini Pro
- Gemini Ultra
- Gemini Experimental
- Gemini 1.5
- OpenAI GPT-3.5-turbo
- OpenAI GPT-4
- OpenAI GPT-4o
- OpenAI GPT-4o-mini
- OpenAI GPT-4-turbo
- OpenAI GPT-3.5-turbo (Azure)
- OpenAI GPT-4 (Azure)
- OpenAI GPT-4o (Azure)

Generated fuzz targets are evaluated with four metrics against the most up-to-date data from the production environment:

- Compilability
- Runtime crashes
- Runtime coverage
- Runtime line coverage diff against existing human-written fuzz targets in OSS-Fuzz.

Here is a sample experiment result from 2024 Jan 31. The experiment included 1300+ benchmarks from 297 open-source projects.

Overall, this framework successfully uses LLMs to generate valid fuzz targets (those that produce a non-zero coverage increase) for 160 C/C++ projects. The maximum line coverage increase over the existing human-written targets is 29%.

Note that these reports are not public as they may contain undisclosed vulnerabilities.

Usage

Check our detailed usage guide for instructions on how to run this framework and generate reports based on the results.

Independent Agent Execution and Evaluation

You can also execute or evaluate individual agents without running full experiments, using the integrated agent execution framework. See the framework's documentation for detailed instructions on how to run an individual agent or a sequence of agents.

Collaborations

Interested in research or open-source community collaborations? Please feel free to create an issue or email us: oss-fuzz-team@google.com.

Bugs Discovered

So far, we have reported 30 new bugs/vulnerabilities found by automatically generated targets built by this framework:

| Project | Bug | LLM | Prompt Builder | Target oracle |
| ------- | --------- | --------- | --------------- | ------- |
| cJSON | OOB read | Vertex AI | Default | Far reach, low coverage |
| libplist | OOB read | Vertex AI | Default | Far reach, low coverage |
| hunspell | OOB read | Vertex AI | default | Far reach, low coverage |
| zstd | OOB write | Vertex AI | default | Far reach, low coverage |
| gdbm | Stack buffer underflow | Vertex AI | default | Far reach, low coverage |
| hoextdown | Use of uninitialised memory | Vertex AI | default | Far reach, low coverage |
| pjsip | OOB read | Vertex AI | Default | Low coverage with fuzz keyword + easy params far reach |
| pjsip | OOB read | Vertex AI | Default | Low coverage with fuzz keyword + easy params far reach |
| gpac | OOB read | Vertex AI | Default | Low coverage with fuzz keyword + easy params far reach |
| gpac | OOB read/write | Vertex AI | Default | All |
| gpac | OOB read | Vertex AI | Default | All |
| gpac | OOB read | Vertex AI | Default | All |
| sqlite3 | OOB read | Vertex AI | Default | All |
| htslib | OOB read | Vertex AI | Default | All |
| libical | OOB read | Vertex AI | Default | All |
| croaring | OOB read | Vertex AI | Test-to-harness | All |
| openssl | CVE-2024-9143 - OOB read/write | Vertex AI | Default | All |
| liblouis | Use of uninitialised memory | Vertex AI | Test-to-harness | Test identifier |
| libucl | OOB read | Vertex AI | Default | Low coverage with fuzz keyword + easy params far reach |
| openbabel | Use after free | Vertex AI | Default | Low coverage with fuzz keyword + easy params far reach |
| libyang | OOB read | Vertex AI | Default | All |
| openbabel | OOB read | Vertex AI | Default | All |
| exiv2 | OOB read | Vertex AI | Default | All |
| Undisclosed | Java RCE (pending maintainer triage) | Vertex AI | Default | Far reach, low coverage |
| Undisclosed | Regexp DoS (pending maintainer triage) | Vertex AI | Default | Far reach, low coverage |
| Undisclosed | OOB read | Vertex AI | Default | All |
| Undisclosed | OOB write | Vertex AI | Default | All |
| Undisclosed | OOB read | Vertex AI | Default | All |
| Undisclosed | OOB read | Vertex AI | Default | All |
| Undisclosed | Use after free | Vertex AI | Agent prompt | All |

These bugs could only have been discovered with newly generated targets. They were not reachable with existing OSS-Fuzz targets.

Current top coverage improvements by project

| Project | Total coverage gain | Total relative gain | OSS-Fuzz-gen total covered lines | OSS-Fuzz-gen new covered lines | Existing covered lines | Total project lines |
| --------| ------------------- | ------------------- | -------------------------------- | ------------------------------ | ---------------------- | ------------------- |
| phmap | 98.42% | 205.75% | 1601 | 1181 | 574 | 1120 |
| usbguard | 97.62% | 26.04% | 24550 | 5463 | 20979 | 3564 |
| onednn | 96.67% | 7057.14% | 5434 | 5434 | 77 | 210 |
| avahi | 82.06% | 155.90% | 3358 | 2814 | 1805 | 3046 |
| pugixml | 72.98% | 194.95% | 9015 | 6646 | 3409 | 7662 |
| librdkafka | 66.88% | 845.57% | 5019 | 4490 | 531 | 1169 |
| casync | 66.75% | 903.23% | 1171 | 1120 | 124 | 1678 |
| tomlplusplus | 61.06% | 331.10% | 4755 | 3652 | 1103 | 5981 |
| astc-encoder | 59.35% | 177.88% | 2726 | 1745 | 981 | 2940 |
| mruby | 48.56% | 0.00% | 34493 | 34493 | 0 | 71038 |
| arduinojson | 42.10% | 85.80% | 3344 | 1800 | 2098 | 4276 |
| json | 41.13% | 66.51% | 5051 | 3339 | 5020 | 8119 |
| double-conversion | 40.40% | 88.12% | 1663 | 779 | 884 | 1928 |
| tinyobjloader | 38.26% | 77.01% | 1157 | 717 | 931 | 1874 |
| glog | 38.18% | 58.69% | 895 | 331 | 564 | 867 |
| cppitertools | 35.78% | 45.07% | 253 | 151 | 335 | 422 |
| eigen | 35.38% | 190.70% | 2643 | 1947 | 1021 | 5503 |
| glaze | 34.55% | 30.06% | 2920 | 2416 | 8036 | 6993 |
| rapidjson | 31.83% | 148.07% | 1585 | 958 | 647 | 3010 |
| libunwind | 30.58% | 83.25% | 2899 | 1342 | 1612 | 4388 |
| openh264 | 30.07% | 50.14% | 6607 | 5751 | 11470 | 19123 |

* "Total project lines" measures the source code of the project-under-test compiled and linked by the preexisting human-written fuzz targets from OSS-Fuzz.

* "Total coverage gain" is calculated using a denominator of the "Total project lines". "Total relative gain" is the increase in coverage compared to the old number of covered lines.

* Additional code from the project-under-test may be included when compiling the new fuzz targets, which can result in high percentage gains.
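The two percentages defined in the notes above can be sketched as follows. This is a minimal illustration inferred from the footnotes and the table, not the framework's own reporting code. The function and parameter names are hypothetical. Returning 0 when there were no previously covered lines is an assumption that mirrors the mruby row.

```python
def coverage_gains(
    new_covered_lines: int,
    existing_covered_lines: int,
    total_project_lines: int,
) -> tuple[float, float]:
    """Return (total coverage gain %, total relative gain %), rounded to 2 places."""
    # Total coverage gain: newly covered lines over "Total project lines".
    total_gain = 100.0 * new_covered_lines / total_project_lines

    # Total relative gain: newly covered lines over previously covered lines.
    # Assumption: report 0 when nothing was covered before (as in the mruby row).
    if existing_covered_lines > 0:
        relative_gain = 100.0 * new_covered_lines / existing_covered_lines
    else:
        relative_gain = 0.0

    return round(total_gain, 2), round(relative_gain, 2)


# Values taken from the tomlplusplus and mruby rows of the table.
tomlplusplus = coverage_gains(3652, 1103, 5981)
mruby = coverage_gains(34493, 0, 71038)
```

With the tomlplusplus row this reproduces the listed 61.06% and 331.10%. Some rows, such as phmap, do not reproduce from the listed "Total project lines", possibly for the reason given in the last note.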

Citing This Work

Please click on the 'Cite this repository' button located on the right-hand side of this GitHub page for citation details.

Owner

Name: Google
Login: google
Kind: organization
Email: opensource@google.com
Location: United States of America

Website: https://opensource.google/
Twitter: GoogleOSS
Repositories: 2,773
Profile: https://github.com/google

Google ❤️ Open Source

Citation (CITATION.cff)

cff-version: 1.2.0
title: 'OSS-Fuzz-Gen: Automated Fuzz Target Generation'
message: >-
  If you use this software, please cite it using the
  metadata from this file.
type: software
authors:
  - given-names: Dongge
    family-names: Liu
    email: donggeliu@google.com
    affiliation: Google LLC
    orcid: 'https://orcid.org/0000-0003-4821-7033'
  - given-names: Oliver
    family-names: Chang
    email: ochang@google.com
    affiliation: Google LLC
    orcid: 'https://orcid.org/0009-0006-3181-4551'
  - given-names: Jonathan
    family-names: metzman
    email: metzman@google.com
    affiliation: Google LLC
    orcid: 'https://orcid.org/0000-0002-7042-0444'
  - given-names: Martin
    family-names: Sablotny
    email: msablotny@nvidia.com
    affiliation: NVIDIA
    orcid: 'https://orcid.org/0000-0002-9836-8254'
  - given-names: Mihai
    family-names: Maruseac
    email: mihaimaruseac@google.com
    affiliation: Google LLC
    orcid: 'https://orcid.org/0000-0002-6225-1206'
repository-code: 'https://github.com/google/oss-fuzz-gen'
url: >-
  https://security.googleblog.com/2023/08/ai-powered-fuzzing-breaking-bug-hunting.html
abstract: >-
  OSS-Fuzz-Gen, an innovative open-source project developed
  by Google, automates fuzz target generation to enhance
  software security and reliability. Utilizing advanced
  techniques, including large language models (LLM), static
  code analysis, and runtime crash diagnosis, this project
  efficiently creates and optimizes fuzz targets. These
  efforts increase code coverage and identify
  vulnerabilities within open-source projects. We actively
  encourage and support collaborations with the research and
  open-source communities, offering our services at no cost.
keywords:
  - Fuzzing
  - Fuzz target generation
  - Large Language Models
  - Open-source
  - Code analysis
  - Software security
license: Apache-2.0
version: 'https://github.com/google/oss-fuzz-gen/tree/v1.0'
date-released: '2024-05-02'

Committers

Last synced: 6 months ago

All Time

Total Commits: 742
Total Committers: 27
Avg Commits per committer: 27.481
Development Distribution Score (DDS): 0.612

Past Year

Commits: 420
Committers: 17
Avg Commits per committer: 24.706
Development Distribution Score (DDS): 0.529

Top Committers

Name	Email	Commits
DavidKorczynski	d**d@a**m	288
Dongge Liu	d**u@g**m	158
Arthur Chan	a**n@a**m	123
Oliver Chang	o****g	58
dependabot[bot]	4****]	20
cjx10	c**4@g**m	19
AmPaschal	3****l	14
Erfan	e**o@g**m	11
Abhishek Arya	a**a@g**m	7
Maoyi Xie	m**1@e**g	7
Myan V.	1****s	5
MarkLee131	1**6@q**m	5
trashvisor	6****r	4
fdt622	m**2@g**m	3
jonathanmetzman	3****n	3
Zewei Wang	v**6@g**m	3
Kartikay Singh pundir	1****7	2
Mihai Maruseac	m**c@g**m	2
Mark Teffeteller	m**r@g**m	2
vwvw	v****w	1
eric k	e**5@g**m	1
chyun	7****n	1
Scott Brenner	s**t@s**e	1
Rex P	1****x	1
Jack Lin	c**1@g**m	1
Ikko Eltociear Ashimine	e**r@g**m	1
Ayush Bhardwaj	9****h	1
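
The all-time summary figures can be reproduced from the per-committer counts in the table above. The sketch below assumes the common definition of the Development Distribution Score, one minus the top committer's share of all commits. That definition is not stated on this page, and the function name is hypothetical.

```python
# All-time commit counts, copied from the Top Committers table.
ALL_TIME_COMMITS = [
    288, 158, 123, 58, 20, 19, 14, 11, 7, 7, 5, 5, 4, 3,
    3, 3, 2, 2, 2, 1, 1, 1, 1, 1, 1, 1, 1,
]


def development_distribution_score(commits_per_committer: list[int]) -> float:
    """Assumed definition: 1 - (commits by top committer / total commits)."""
    total = sum(commits_per_committer)
    if total == 0:
        return 0.0  # No commits: treat the score as 0 to avoid dividing by zero.
    return 1.0 - max(commits_per_committer) / total


total_commits = sum(ALL_TIME_COMMITS)
average_commits = total_commits / len(ALL_TIME_COMMITS)
dds = development_distribution_score(ALL_TIME_COMMITS)
```

Under this assumption the counts give 742 total commits, an average of 27.481 commits per committer, and a DDS of 0.612, matching the all-time figures listed.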

Committer Domains (Top 20 + Academic)

google.com: 5, adalogics.com: 2, scottbrenner.me: 1, qq.com: 1, e.ntu.edu.sg: 1

Issues and Pull Requests

Last synced: 6 months ago

All Time

Total issues: 133
Total pull requests: 954
Average time to close issues: 15 days
Average time to close pull requests: 4 days
Total issue authors: 27
Total pull request authors: 55
Average comments per issue: 0.55
Average comments per pull request: 1.95
Merged pull requests: 650
Bot issues: 0
Bot pull requests: 27

Past Year

Issues: 47
Pull requests: 471
Average time to close issues: 17 days
Average time to close pull requests: 2 days
Issue authors: 19
Pull request authors: 41
Average comments per issue: 0.4
Average comments per pull request: 1.82
Merged pull requests: 299
Bot issues: 0
Bot pull requests: 12


Top Authors

Issue Authors

DonggeLiu (56)
oliverchang (32)
DavidKorczynski (12)
Ekam219 (3)
Kartikayy007 (3)
myanvoos (3)
erfanio (2)
abohoss (2)
fedecerno (2)
AkshataABhat (1)
inferno-chromium (1)
aaronatp (1)
Marsman1996 (1)
Py7600tyty (1)
MarkLee131 (1)

Pull Request Authors

DavidKorczynski (307)
DonggeLiu (211)
arthurscchan (132)
oliverchang (64)
cjx10 (32)
dependabot[bot] (27)
AmPaschal (22)
erfanio (17)
trashvisor (12)
myanvoos (11)
maoyixie (9)
happy-qop (8)
Kartikayy007 (8)
inferno-chromium (7)
MarkLee131 (7)

Top Labels

Issue Labels

infra (42) prompt-engineering (8) enhancement (4) bug (4) Report (4) Quality-of-Life (2) Experiment-only (1) training (1)

Pull Request Labels

dependencies (27) Experiment-only (22) github_actions (16) python (5) infra (4) bug (2)

Dependencies

Dockerfile docker

debian 12 build

requirements.in pypi

Flask ==2.3.2
PyYAML ==6.0
google-cloud-aiplatform ==1.39.0
google-cloud-storage ==2.9.0
openai ==0.27.8
pandas ==2.1.1
pylint ==2.17.5
pyright ==1.1.345
requests ==2.28.1
tiktoken ==0.5.1
yapf ==0.40.1

requirements.txt pypi

aiohttp ==3.9.1
aiosignal ==1.3.1
astroid ==2.15.8
attrs ==23.2.0
blinker ==1.7.0
cachetools ==5.3.2
certifi ==2023.11.17
charset-normalizer ==2.1.1
click ==8.1.7
dill ==0.3.7
flask ==2.3.2
frozenlist ==1.4.1
google-api-core ==2.15.0
google-auth ==2.26.2
google-cloud-aiplatform ==1.39.0
google-cloud-bigquery ==3.16.0
google-cloud-core ==2.4.1
google-cloud-resource-manager ==1.11.0
google-cloud-storage ==2.9.0
google-crc32c ==1.5.0
google-resumable-media ==2.7.0
googleapis-common-protos ==1.62.0
grpc-google-iam-v1 ==0.13.0
grpcio ==1.60.0
grpcio-status ==1.60.0
idna ==3.6
importlib-metadata ==7.0.1
isort ==5.13.2
itsdangerous ==2.1.2
jinja2 ==3.1.3
lazy-object-proxy ==1.10.0
markupsafe ==2.1.3
mccabe ==0.7.0
multidict ==6.0.4
nodeenv ==1.8.0
numpy ==1.26.3
openai ==0.27.8
packaging ==23.2
pandas ==2.1.1
platformdirs ==4.1.0
proto-plus ==1.23.0
protobuf ==4.25.2
pyasn1 ==0.5.1
pyasn1-modules ==0.3.0
pylint ==2.17.5
pyright ==1.1.345
python-dateutil ==2.8.2
pytz ==2023.3.post1
pyyaml ==6.0
regex ==2023.12.25
requests ==2.28.1
rsa ==4.9
shapely ==2.0.2
six ==1.16.0
tiktoken ==0.5.1
tomli ==2.0.1
tomlkit ==0.12.3
tqdm ==4.66.1
tzdata ==2023.4
urllib3 ==1.26.18
werkzeug ==3.0.1
wrapt ==1.16.0
yapf ==0.40.1
yarl ==1.9.4
zipp ==3.17.0

.github/workflows/lint.yaml actions

actions/checkout v3 composite
actions/setup-python v4 composite

.github/workflows/push-pr-to-gcloud.yml actions

actions/checkout v4 composite
google-github-actions/auth v2 composite
google-github-actions/setup-gcloud v2 composite

.github/workflows/push-to-gcloud.yml actions

actions/checkout v4 composite
google-github-actions/auth v2 composite
google-github-actions/setup-gcloud v2 composite
