doomarena

DoomArena is a Framework for Testing AI Agents Against Evolving Security Threats

https://github.com/servicenow/doomarena

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.5%) to scientific vocabulary

Keywords

ai ai-safety attack browsergym defense llm machine machine-learning red-teaming security taubench web-agents
Last synced: 6 months ago · JSON representation

Repository

DoomArena is a Framework for Testing AI Agents Against Evolving Security Threats

Basic Info
Statistics
  • Stars: 39
  • Watchers: 1
  • Forks: 4
  • Open Issues: 0
  • Releases: 0
Topics
ai ai-safety attack browsergym defense llm machine machine-learning red-teaming security taubench web-agents
Created 11 months ago · Last pushed 8 months ago
Metadata Files
Readme License Code of conduct Citation Security

README.md

DoomArena: A Framework for Testing AI Agents Against Evolving Security Threats

pypi PyPI - License PyPI - Downloads GitHub star chart

DoomArena is a modular, configurable, plug-in security testing framework for AI agents that supports many agentic frameworks including $\tau$-bench, Browsergym, OSWorld and TapeAgents (see Mail agent example). It enables testing agents in the face of adversarial attacks consistent with a given threat model, and supports several attacks (with the ability for users to add their own) and several threat models.

🚀 Quick Start

The DoomArena Intro Notebook is a good place for learning hands-on about the core concepts of DoomArena. You will implement an AttackGateway and a simple FixedInjectionAttack to alter the normal behavior of a simple flight searcher agent.

If you only want to use the library just run bash pip install doomarena # core library, minimal dependencies

If you want to run DoomArena integrated with TauBench, additionally run

bash pip install doomarena-taubench # optional

If you want to run DoomArena integrated with Browsergym, additionally run

bash pip install doomarena-browsergym # optional

If you want to test attacks on a Mail Agent (which can summarize and send emails on your behalf) inspired by the LLMail Challenge run bash pip install -e doomarena/mailinject # optional

If you want to run DoomArena integrated with OSWorld run pip install -e doomarena/osworld and follow our setup instructions here.

Export relevant API keys into your environment or .env file. bash OPENAI_API_KEY="<your api key>" OPENROUTER_API_KEY="<your api key>"

🛠️ Advanced Setup

To actively develop DoomArena, please create a virtual environment and install the package locally in editable mode using bash pip install -e doomarena/core pip install -e doomarena/taubench pip install -e doomarena/browsergym pip install -e doomarena/mailinject pip install -e doomarena/osworld

Once the environments are set up, run the tests to make sure everything is working. bash make ci-tests make tests # requires openai key

💻 Running Experiments

Follow the environment-specific instructions for TauBench and BrowserGym

🌟 Contributors

DoomArena contributors

Note: contributions made prior to the open-sourcing are not accounted for; please refer to author list for full list of contributors.

📝 Paper

If you found DoomArena helpful, please cite us @misc{boisvert2025doomarenaframeworktestingai, title={DoomArena: A framework for Testing AI Agents Against Evolving Security Threats}, author={Leo Boisvert and Mihir Bansal and Chandra Kiran Reddy Evuru and Gabriel Huang and Abhay Puri and Avinandan Bose and Maryam Fazel and Quentin Cappart and Jason Stanley and Alexandre Lacoste and Alexandre Drouin and Krishnamurthy Dvijotham}, year={2025}, eprint={2504.14064}, archivePrefix={arXiv}, primaryClass={cs.CR}, url={https://arxiv.org/abs/2504.14064}, }

Owner

  • Name: ServiceNow
  • Login: ServiceNow
  • Kind: organization

Works for you™

Citation (CITATION.cff)


      

GitHub Events

Total
  • Issues event: 3
  • Watch event: 36
  • Issue comment event: 1
  • Member event: 1
  • Push event: 65
  • Public event: 1
  • Pull request review event: 1
  • Pull request event: 20
  • Fork event: 2
  • Create event: 11
Last Year
  • Issues event: 3
  • Watch event: 36
  • Issue comment event: 1
  • Member event: 1
  • Push event: 65
  • Public event: 1
  • Pull request review event: 1
  • Pull request event: 20
  • Fork event: 2
  • Create event: 11

Dependencies

doomarena/browsergym/pyproject.toml pypi
  • agentlab *
  • browsergym *
  • doomarena >=0.0.4
  • playwright *
doomarena/core/pyproject.toml pypi
  • PyYAML *
  • litellm *
  • pydantic *
  • pytest *
  • python-dotenv *
  • tenacity *
  • tqdm *
doomarena/taubench/pyproject.toml pypi
  • doomarena >=0.0.4