honestcybereval

An honest eval of the cyber risk of AI using a real cyber task (vs proxy tasks).

https://github.com/alan-turing-institute/honestcybereval

Science Score: 72.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Committers with academic emails
    2 of 4 committers (50.0%) from academic institutions
  • Institutional organization owner
    Organization alan-turing-institute has institutional domain (turing.ac.uk)
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (14.2%) to scientific vocabulary
Last synced: 6 months ago

Repository

An honest eval of the cyber risk of AI using a real cyber task (vs proxy tasks).

Basic Info
  • Host: GitHub
  • Owner: alan-turing-institute
  • License: mit
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 783 KB
Statistics
  • Stars: 2
  • Watchers: 1
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created about 1 year ago · Last pushed 7 months ago
Metadata Files
Readme Contributing License Citation

README.md

HonestCyberEval

HonestCyberEval assesses AI models’ capabilities and risks in automated software exploitation, focusing on their ability to detect and exploit vulnerabilities in real-world software systems. It models an end-to-end real cyber task and requires the model to provide structured inputs that trigger known sanitizers (i.e., proof of exploitation). We currently use the Nginx web server repository augmented with synthetic vulnerabilities (from AIxCC), and we are expanding the set of projects and synthetic vulnerabilities used in our assessment.

Task

  • exploit.py - tasks the model with generating an input that triggers a specific vulnerability, using a reflexion loop that stops once the model generates a correct input; the task runs over multiple epochs but skips subsequent epochs once a correct generation is produced.
  • identify.py - tasks the model with identifying which of a set of vulnerabilities is present in the code using multiple choice prompting.
  • paired.py - given both a vulnerable and a fixed version of the code, tasks the model with differentiating correctly between the two; based on Vulnerability Detection with Code Language Models: How Far Are We?, Ding et al., 2024.
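The early-stopping reflexion loop used by the exploit task can be sketched roughly as follows. This is a simplified illustration, not the repository's actual solver: `generate_input`, `critique`, and `triggers_sanitizer` are hypothetical stand-ins for the model call, the critique step, and the sanitizer check.

```python
# Hypothetical sketch of a reflexion loop: the model proposes an input,
# a critique is fed back on failure, and the loop stops early on success.

def run_reflexion(generate_input, critique, triggers_sanitizer, max_iterations=5):
    """Return (success, attempts_used) after at most max_iterations tries."""
    feedback = None
    for attempt in range(1, max_iterations + 1):
        candidate = generate_input(feedback)   # model call, conditioned on feedback
        if triggers_sanitizer(candidate):      # proof of exploitation
            return True, attempt               # terminate the loop early
        feedback = critique(candidate)         # reflexion: critique the failed attempt
    return False, max_iterations
```

A successful attempt returns immediately, which is what lets the harness skip the remaining iterations (and, at the epoch level, the remaining epochs).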

Setup

Due to the requirements of the AIxCC repos, this works best on Linux.

  • Install dependencies:

```shell
sudo apt install make
```

  • Install yq

    • E.g.:

    ```shell
    sudo snap install yq
    ```

  • To avoid issues with address randomisation, run:

```shell
sudo sysctl vm.mmap_rnd_bits=28
echo "vm.mmap_rnd_bits=28" | sudo tee -a /etc/sysctl.conf
```

  • Set up the environment variables and API keys:
    • Rename the .env.example file:

```shell
cp .env.example .env
```

  • Generate a new personal access token (PAT) (https://github.com/settings/tokens) with read:packages permissions. Fill in the GITHUB_USER and GITHUB_TOKEN values.
  • Fill in API keys for the LLM(s) that are to be evaluated (ANTHROPIC_API_KEY, AZURE_API_KEY, OPENAI_API_KEY).
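After these steps, the `.env` file might look like the following. Every value below is a placeholder for illustration, not a real credential:

```shell
# Hypothetical .env contents; replace each placeholder with your own values.
GITHUB_USER=your-github-username
GITHUB_TOKEN=ghp_your_read_packages_pat
ANTHROPIC_API_KEY=your-anthropic-key
AZURE_API_KEY=your-azure-key
OPENAI_API_KEY=your-openai-key
```

Only the keys for the providers you actually evaluate need to be filled in.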

Docker

The evaluation challenge projects are run inside Docker containers. If Docker is unavailable, install it by following the official documentation. Then enable managing Docker as a non-root user.

To pull the Docker images for the challenge projects, log into ghcr.io using your PAT:

```shell
echo "<token>" | docker login ghcr.io -u <user> --password-stdin
```

replacing <user> with your GitHub username and <token> with your generated PAT.

Running the evaluation

First, configure which challenge project should be downloaded by (un)commenting the appropriate entries in config/cp_config.yaml.
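A config in this style typically lists the challenge projects to fetch, with unwanted entries commented out. The fragment below is a hypothetical illustration only; the actual keys and project entries are defined in the repository's `config/cp_config.yaml`:

```yaml
# Hypothetical illustration: uncomment the projects you want to evaluate.
cp_targets:
  - nginx-cp
  # - some-other-cp   # commented out, so it will not be downloaded
```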

Run make cps to download the code and Docker images associated with the challenge projects defined in cp_config.yaml. The code will be downloaded to cp_root.

Finally, run the evaluation:

```shell
inspect eval exploit.py --model=<model> -T cp=<challenge project> -S max_iterations=<num>
```

For example:

```shell
inspect eval exploit.py --model=openai/o1 -T cp=nginx-cp
```

will run the nginx-cp project with the default 5 reflexion loops and 5 epochs. A successful attempt will terminate the current loop and skip future epochs.

The optional critique_model solver parameter allows a different model to be called for the critique component of the solver:

```shell
inspect eval exploit.py --model=openai/o1-mini --solver=reflexion_vuln_detect -S critique_model=openai/o1
```

Future work

  • Support challenge projects that expect input as bytes
  • More tasks

Owner

  • Name: The Alan Turing Institute
  • Login: alan-turing-institute
  • Kind: organization
  • Email: info@turing.ac.uk

The UK's national institute for data science and artificial intelligence.

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
- family-names: "Ristea"
  given-names: "Dan"
  affiliation: "The Alan Turing Institute, UCL"
- family-names: "Mavroudis"
  given-names: "Vasilios"
  affiliation: "The Alan Turing Institute"
- family-names: "Hicks"
  given-names: "Chris"
  affiliation: "The Alan Turing Institute"
title: "HonestCyberEval"
version: 1.0.0
date-released: 2025-02-21
url: "https://github.com/alan-turing-institute/HonestCyberEval"

GitHub Events

Total
  • Delete event: 2
  • Push event: 7
  • Public event: 1
  • Pull request event: 2
  • Create event: 1
Last Year
  • Delete event: 2
  • Push event: 7
  • Public event: 1
  • Pull request event: 2
  • Create event: 1

Committers

Last synced: 8 months ago

All Time
  • Total Commits: 8
  • Total Committers: 4
  • Avg Commits per committer: 2.0
  • Development Distribution Score (DDS): 0.625
Past Year
  • Commits: 8
  • Committers: 4
  • Avg Commits per committer: 2.0
  • Development Distribution Score (DDS): 0.625
Top Committers
Name Email Commits
danrr d****a@t****k 3
Dan Ristea d****a@t****k 2
Dan Ristea d****a@p****m 2
Vasilios Mavroudis m****v 1
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: 8 months ago

All Time
  • Total issues: 0
  • Total pull requests: 2
  • Average time to close issues: N/A
  • Average time to close pull requests: 23 days
  • Total issue authors: 0
  • Total pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 2
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 2
  • Average time to close issues: N/A
  • Average time to close pull requests: 23 days
  • Issue authors: 0
  • Pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 2
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
  • danrr (3)
Top Labels
Issue Labels
Pull Request Labels

Dependencies

.github/workflows/crs.yml actions
  • actions/checkout v4 composite
  • actions/setup-python v5 composite
  • jakebailey/pyright-action v2 composite
Dockerfile docker
  • python 3.12.7-slim-bookworm build
pyproject.toml pypi
  • GitPython ==3.1.43
  • PyYAML ==6.0.2
  • aiorwlock ==1.4.0
  • aioshutil ==1.4
  • inspect-ai ==0.3.69
requirements.txt pypi
  • aiobotocore ==2.18.0
  • aiohappyeyeballs ==2.4.4
  • aiohttp ==3.11.11
  • aioitertools ==0.12.0
  • aiorwlock ==1.4.0
  • aioshutil ==1.4
  • aiosignal ==1.3.2
  • annotated-types ==0.7.0
  • anyio ==4.8.0
  • attrs ==24.3.0
  • beautifulsoup4 ==4.12.3
  • botocore ==1.36.1
  • certifi ==2024.12.14
  • click ==8.1.8
  • debugpy ==1.8.12
  • docstring-parser ==0.16
  • frozenlist ==1.5.0
  • fsspec ==2024.12.0
  • gitdb ==4.0.12
  • gitpython ==3.1.43
  • h11 ==0.14.0
  • httpcore ==1.0.7
  • httpx ==0.28.1
  • idna ==3.10
  • ijson ==3.3.0
  • inspect-ai ==0.3.69
  • jmespath ==1.0.1
  • jsonlines ==4.0.0
  • jsonpatch ==1.33
  • jsonpointer ==3.0.0
  • jsonschema ==4.23.0
  • jsonschema-specifications ==2024.10.1
  • linkify-it-py ==2.0.3
  • markdown-it-py ==3.0.0
  • mdit-py-plugins ==0.4.2
  • mdurl ==0.1.2
  • mmh3 ==5.0.1
  • multidict ==6.1.0
  • nest-asyncio ==1.6.0
  • numpy ==1.26.4
  • platformdirs ==4.3.6
  • propcache ==0.2.1
  • psutil ==6.1.1
  • pydantic ==2.10.5
  • pydantic-core ==2.27.2
  • pygments ==2.19.1
  • python-dateutil ==2.9.0.post0
  • python-dotenv ==1.0.1
  • pyyaml ==6.0.2
  • referencing ==0.36.1
  • rich ==13.9.4
  • rpds-py ==0.22.3
  • s3fs ==2024.12.0
  • semver ==3.0.2
  • shortuuid ==1.0.13
  • six ==1.17.0
  • smmap ==5.0.2
  • sniffio ==1.3.1
  • soupsieve ==2.6
  • tenacity ==9.0.0
  • textual ==1.0.0
  • typing-extensions ==4.12.2
  • uc-micro-py ==1.0.3
  • urllib3 ==2.3.0
  • wrapt ==1.17.2
  • yarl ==1.18.3
  • zipp ==3.21.0
src/solvers/setup.py pypi