honestcybereval
An honest eval of the cyber risk of AI using a real cyber task (vs proxy tasks).
Science Score: 72.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: found
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references: not found
- ✓ Academic publication links: links to arxiv.org
- ✓ Committers with academic emails: 2 of 4 committers (50.0%) from academic institutions
- ✓ Institutional organization owner: organization alan-turing-institute has institutional domain (turing.ac.uk)
- ○ JOSS paper metadata: not found
- ○ Scientific vocabulary similarity: low similarity (14.2%) to scientific vocabulary
Repository
Basic Info
Statistics
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
HonestCyberEval
HonestCyberEval assesses AI models' capabilities and risks in automated software exploitation, focusing on their ability to detect and exploit vulnerabilities in real-world software systems. It models an end-to-end, real-world cyber task and requires the model to provide structured inputs that trigger known sanitizers (i.e., proof of exploitation). We currently use the Nginx web server repository augmented with synthetic vulnerabilities (from AIxCC), and we are expanding the set of projects and synthetic vulnerabilities used in our assessment.
Task
- `exploit.py`: tasks the model with generating an input that triggers a specific vulnerability, using a reflective loop that stops once the model generates a correct input; runs over multiple epochs but skips subsequent epochs once a correct generation is produced.
- `identify.py`: tasks the model with identifying which of a set of vulnerabilities is present in the code, using multiple-choice prompting.
- `paired.py`: given both a vulnerable and a fixed version of the code, tasks the model with correctly differentiating between the two; based on "Vulnerability Detection with Code Language Models: How Far Are We?", Ding et al., 2024.
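The reflective loop behind the exploit task can be sketched as follows. This is an illustrative stand-in, not the repository's actual inspect-ai implementation: the names `reflective_exploit_loop`, `model`, and `check` are hypothetical, and the toy "model" and "sanitizer" below exist only to make the control flow concrete.

```python
from typing import Callable, Optional

def reflective_exploit_loop(
    model: Callable[[str], str],
    check: Callable[[str], bool],
    prompt: str,
    max_iterations: int = 5,
    epochs: int = 5,
) -> Optional[str]:
    """Ask the model for a triggering input, feeding failures back as critique.

    Returns the first candidate input that satisfies `check`, or None.
    A success ends the current loop and skips all remaining epochs.
    """
    for _epoch in range(epochs):
        feedback = ""
        for _ in range(max_iterations):
            candidate = model(prompt + feedback)
            if check(candidate):
                return candidate  # success: stop and skip remaining epochs
            feedback = f"\nPrevious attempt failed: {candidate!r}. Try again."
    return None

# Toy stand-ins: a "model" that eventually guesses the right payload,
# and a "sanitizer" check that accepts it.
attempts = iter(["AAAA", "BBBB", "trigger-sanitizer"])
result = reflective_exploit_loop(
    model=lambda _prompt: next(attempts),
    check=lambda s: s == "trigger-sanitizer",
    prompt="Generate an input that triggers the vulnerability.",
)
print(result)  # trigger-sanitizer
```

The key design point mirrored here is that a correct generation short-circuits both the inner critique loop and the outer epochs, so no model calls are wasted after success.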
Setup
Due to the requirements of the AIxCC repos, this works best on Linux.
- Install dependencies:

```shell
sudo apt install make
```
- Install yq, e.g.:

```shell
sudo snap install yq
```

- To avoid issues with address randomisation, run:

```shell
sudo sysctl vm.mmap_rnd_bits=28
echo "vm.mmap_rnd_bits=28" | sudo tee -a /etc/sysctl.conf
```
- Set up the environment variables and API keys:
  - Rename the `.env.example` file:

```shell
cp .env.example .env
```

  - Generate a new personal access token (PAT) (https://github.com/settings/tokens) with `read:packages` permissions. Fill in the `GITHUB_USER` and `GITHUB_TOKEN` values.
  - Fill in API keys for the LLM(s) that are to be evaluated (`ANTHROPIC_API_KEY`, `AZURE_API_KEY`, `OPENAI_API_KEY`).
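The `.env` values are read at runtime (the dependency list includes python-dotenv for this). As a minimal, stdlib-only stand-in for that loader, the sketch below shows what parsing `.env` and checking the README's required keys amounts to; `load_env` is a hypothetical helper and the values written to the file are placeholders, not real credentials.

```python
from pathlib import Path

def load_env(path: str = ".env") -> dict:
    """Minimal .env parser (stand-in for python-dotenv): reads KEY=VALUE
    lines, ignoring blank lines and '#' comments, stripping quotes."""
    env = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip().strip('"')
    return env

# Write a placeholder .env, then verify the keys the README asks for.
Path(".env").write_text(
    'GITHUB_USER=octocat\nGITHUB_TOKEN="ghp_example"\n# comment\nOPENAI_API_KEY=sk-example\n'
)
env = load_env()
missing = [k for k in ("GITHUB_USER", "GITHUB_TOKEN") if not env.get(k)]
print(missing)  # []
```

In the real project, python-dotenv's `load_dotenv()` populates `os.environ` instead of returning a dict, but the required-key check is the same idea.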
Docker
The evaluation challenge projects run inside Docker containers. If Docker is unavailable, install it by following the official documentation. Then, enable managing Docker as a non-root user.
To pull the Docker images for the challenge projects, log into ghcr.io using your PAT:

```shell
echo "<token>" | docker login ghcr.io -u <user> --password-stdin
```

replacing <user> with your GitHub username and <token> with your generated PAT.
Running the evaluation
First, configure which challenge projects should be downloaded by (un)commenting the appropriate entries in `config/cp_config.yaml`.

Then run `make cps` to download the code and Docker images associated with the challenge projects defined in `cp_config.yaml`. The code is downloaded to `cp_root`.
Finally, run the evaluation using `inspect eval exploit.py --model=<model> -T cp=<challenge project> -S max_iterations=<num>`. For example:

```shell
inspect eval exploit.py --model=openai/o1 -T cp=nginx-cp
```

will run the nginx-cp project with the default 5 reflexion loops and 5 epochs. A successful attempt terminates the current loop and skips future epochs.
The optional `critique_model` solver parameter allows a different model to be used for the critique component of the solver:

```shell
inspect eval exploit.py --model=openai/o1-mini --solver=reflexion_vuln_detect -S critique_model=openai/o1
```
Future work
- Support challenge projects that expect input as bytes
- More tasks
Owner
- Name: The Alan Turing Institute
- Login: alan-turing-institute
- Kind: organization
- Email: info@turing.ac.uk
- Website: https://turing.ac.uk
- Repositories: 477
- Profile: https://github.com/alan-turing-institute
The UK's national institute for data science and artificial intelligence.
Citation (CITATION.cff)
```yaml
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
authors:
  - family-names: "Ristea"
    given-names: "Dan"
    affiliation: "The Alan Turing Institute, UCL"
  - family-names: "Mavroudis"
    given-names: "Vasilios"
    affiliation: "The Alan Turing Institute"
  - family-names: "Hicks"
    given-names: "Chris"
    affiliation: "The Alan Turing Institute"
title: "HonestCyberEval"
version: 1.0.0
date-released: 2025-02-21
url: "https://github.com/alan-turing-institute/HonestCyberEval"
```
GitHub Events
Total
- Delete event: 2
- Push event: 7
- Public event: 1
- Pull request event: 2
- Create event: 1
Last Year
- Delete event: 2
- Push event: 7
- Public event: 1
- Pull request event: 2
- Create event: 1
Committers
Last synced: 8 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| danrr | d****a@t****k | 3 |
| Dan Ristea | d****a@t****k | 2 |
| Dan Ristea | d****a@p****m | 2 |
| Vasilios Mavroudis | m****v | 1 |
Committer Domains (Top 20 + Academic)
Issues and Pull Requests
Last synced: 8 months ago
All Time
- Total issues: 0
- Total pull requests: 2
- Average time to close issues: N/A
- Average time to close pull requests: 23 days
- Total issue authors: 0
- Total pull request authors: 1
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 2
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 0
- Pull requests: 2
- Average time to close issues: N/A
- Average time to close pull requests: 23 days
- Issue authors: 0
- Pull request authors: 1
- Average comments per issue: 0
- Average comments per pull request: 0.0
- Merged pull requests: 2
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
- danrr (3)
Top Labels
Issue Labels
Pull Request Labels
Dependencies
- actions/checkout v4 composite
- actions/setup-python v5 composite
- jakebailey/pyright-action v2 composite
- python 3.12.7-slim-bookworm build
- GitPython ==3.1.43
- PyYAML ==6.0.2
- aiorwlock ==1.4.0
- aioshutil ==1.4
- inspect-ai ==0.3.69
- aiobotocore ==2.18.0
- aiohappyeyeballs ==2.4.4
- aiohttp ==3.11.11
- aioitertools ==0.12.0
- aiorwlock ==1.4.0
- aioshutil ==1.4
- aiosignal ==1.3.2
- annotated-types ==0.7.0
- anyio ==4.8.0
- attrs ==24.3.0
- beautifulsoup4 ==4.12.3
- botocore ==1.36.1
- certifi ==2024.12.14
- click ==8.1.8
- debugpy ==1.8.12
- docstring-parser ==0.16
- frozenlist ==1.5.0
- fsspec ==2024.12.0
- gitdb ==4.0.12
- gitpython ==3.1.43
- h11 ==0.14.0
- httpcore ==1.0.7
- httpx ==0.28.1
- idna ==3.10
- ijson ==3.3.0
- inspect-ai ==0.3.69
- jmespath ==1.0.1
- jsonlines ==4.0.0
- jsonpatch ==1.33
- jsonpointer ==3.0.0
- jsonschema ==4.23.0
- jsonschema-specifications ==2024.10.1
- linkify-it-py ==2.0.3
- markdown-it-py ==3.0.0
- mdit-py-plugins ==0.4.2
- mdurl ==0.1.2
- mmh3 ==5.0.1
- multidict ==6.1.0
- nest-asyncio ==1.6.0
- numpy ==1.26.4
- platformdirs ==4.3.6
- propcache ==0.2.1
- psutil ==6.1.1
- pydantic ==2.10.5
- pydantic-core ==2.27.2
- pygments ==2.19.1
- python-dateutil ==2.9.0.post0
- python-dotenv ==1.0.1
- pyyaml ==6.0.2
- referencing ==0.36.1
- rich ==13.9.4
- rpds-py ==0.22.3
- s3fs ==2024.12.0
- semver ==3.0.2
- shortuuid ==1.0.13
- six ==1.17.0
- smmap ==5.0.2
- sniffio ==1.3.1
- soupsieve ==2.6
- tenacity ==9.0.0
- textual ==1.0.0
- typing-extensions ==4.12.2
- uc-micro-py ==1.0.3
- urllib3 ==2.3.0
- wrapt ==1.17.2
- yarl ==1.18.3
- zipp ==3.21.0