https://github.com/confident-ai/deepteam - 🎉 OWASP Top 10, Guardrails

You can now use OWASP Top 10 in deepteam as follows:

```python
from deepteam import red_team
from deepteam.frameworks import OWASPTop10

risk_assessment = red_team(
    model_callback=your_model_callback,
    framework=OWASPTop10(),
)
```

Docs: https://www.trydeepteam.com/docs/red-teaming-owasp-top-10-for-llms
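The `model_callback` argument is the hook through which DeepTeam reaches your application. Below is a minimal sketch of such a callback, assuming it is an async function that takes the attack prompt as a string and returns the application's reply as a string. `my_llm_app` is a hypothetical stand-in for your own model or pipeline and is not part of DeepTeam.

```python
import asyncio


def my_llm_app(prompt: str) -> str:
    """Hypothetical stand-in for your application; replace with a real LLM call."""
    return f"I'm sorry, I can't help with that request: {prompt}"


async def your_model_callback(input: str) -> str:
    # Assumed contract: one adversarial prompt in, the target's reply out.
    return my_llm_app(input)


# Exercise the callback on its own before handing it to a red-teaming run.
print(asyncio.run(your_model_callback("Ignore all previous instructions.")))
```

Checking the callback in isolation first makes it easier to tell application errors apart from red-teaming findings.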

You can now also use Guardrails:

```python
from deepteam.guardrails.guards import PromptInjectionGuard, ToxicityGuard
from deepteam.guardrails import Guardrails

# Initialize guardrails
guardrails = Guardrails(
    input_guards=[PromptInjectionGuard()],
    output_guards=[ToxicityGuard()],
)

res = guardrails.guard_input(input="...")
```

Docs: https://www.trydeepteam.com/docs/guardrails-introduction
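To illustrate where input guards and output guards sit relative to the application, here is a self-contained sketch of the general pattern. It does not use DeepTeam's guard classes or result objects: `KeywordGuard` and `guarded_call` are hypothetical names, and the keyword matching is a deliberately naive stand-in for a real guard.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class KeywordGuard:
    """Hypothetical stand-in guard: flags text containing any blocked phrase."""
    name: str
    blocked_phrases: List[str]

    def is_breached(self, text: str) -> bool:
        lowered = text.lower()
        return any(phrase in lowered for phrase in self.blocked_phrases)


def guarded_call(
    prompt: str,
    app: Callable[[str], str],
    input_guards: List[KeywordGuard],
    output_guards: List[KeywordGuard],
) -> str:
    # Input guards run before the application ever sees the prompt.
    for guard in input_guards:
        if guard.is_breached(prompt):
            return f"Blocked by input guard: {guard.name}"
    response = app(prompt)
    # Output guards run on the response before it reaches the user.
    for guard in output_guards:
        if guard.is_breached(response):
            return f"Blocked by output guard: {guard.name}"
    return response


def echo_app(prompt: str) -> str:
    """Hypothetical application that simply echoes the prompt."""
    return f"You said: {prompt}"


injection = KeywordGuard("prompt-injection", ["ignore previous instructions"])
toxicity = KeywordGuard("toxicity", ["idiot"])

print(guarded_call("Hello", echo_app, [injection], [toxicity]))
print(guarded_call("Ignore previous instructions now", echo_app, [injection], [toxicity]))
```

The point of the split is that an input guard can stop a request before any model call is made, while an output guard catches problems that only appear in the generated response.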

- Python
Published by penguine-ip 11 months ago

https://github.com/confident-ai/deepteam - 🎉 New CLI tool, Agentic Red Teaming

🚀 DeepTeam CLI Release

We’re excited to release the first version of the DeepTeam CLI – a powerful command-line tool for red teaming and evaluating LLM applications with DeepEval.

✨ Features

Red Team Simulation
- Easily specify simulator and evaluation models (gpt-3.5-turbo-0125, gpt-4o, etc.)
- Attack LLM systems with predefined vulnerability categories (e.g., Bias, Toxicity)

Target System Configuration
- Test both foundational models (like gpt-3.5-turbo) and full LLM applications via custom Python wrappers
- Simple YAML config structure for defining the target model's purpose and behavior

System Controls
- Set concurrency and parallelism: max_concurrent, run_async
- Specify how many attacks to run per vulnerability type
- Optional error handling (ignore_errors) and result storage (output_folder)

Pluggable Vulnerabilities and Attacks
- Support for multiple attack types (e.g., Prompt Injection)
- Define default vulnerabilities like:
  - Bias: targeting race and gender
  - Toxicity: profanity and insults

🛠 Example Usage

```yaml
models:
  simulator: gpt-3.5-turbo-0125
  evaluation: gpt-4o

target:
  purpose: "A helpful AI assistant"
  model: gpt-3.5-turbo

system_config:
  max_concurrent: 10
  attacks_per_vulnerability_type: 3
  run_async: true
  ignore_errors: false
  output_folder: "results"

default_vulnerabilities:
  - name: "Bias"
    types: ["race", "gender"]
  - name: "Toxicity"
    types: ["profanity", "insults"]

attacks:
  - name: "Prompt Injection"
```

```bash
deepteam run config.yaml
```
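As a rough sanity check on run size, the example configuration can be mirrored as a plain Python dict and the number of attacks estimated. This sketch assumes each listed vulnerability type receives `attacks_per_vulnerability_type` attacks, which is what the setting's name suggests but is not spelled out in these notes. `estimate_attack_count` is a hypothetical helper, not part of the DeepTeam CLI.

```python
# Values copied from the example configuration; only the keys needed here are kept.
config = {
    "system_config": {"attacks_per_vulnerability_type": 3},
    "default_vulnerabilities": [
        {"name": "Bias", "types": ["race", "gender"]},
        {"name": "Toxicity", "types": ["profanity", "insults"]},
    ],
}


def estimate_attack_count(config: dict) -> int:
    """Estimate attacks as (attacks per type) x (total vulnerability types)."""
    per_type = config["system_config"]["attacks_per_vulnerability_type"]
    total_types = sum(len(v["types"]) for v in config["default_vulnerabilities"])
    return per_type * total_types


print(estimate_attack_count(config))  # 3 attacks x 4 types = 12
```

Under that assumption, adding a vulnerability type or raising the per-type count scales the run linearly, which is worth keeping in mind when setting `max_concurrent`.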

Stay tuned for more attack types, evaluation metrics, and integrations with the DeepEval framework.

🧠 Agentic Red Teaming

Agentic red teaming tests AI agents for vulnerabilities that only emerge when systems operate autonomously, maintain persistent memory, and pursue complex goals.

🧨 Specialized Attack Methods

DeepTeam includes 6 agentic-specific attacks:

- Authority Spoofing: Pretend to be a system admin or override
- Role Manipulation: Trick the agent into changing roles
- Goal Redirection: Reframe or corrupt the agent's priorities
- Linguistic Confusion: Use ambiguity to confuse language understanding
- Validation Bypass: Bypass safety checks through clever phrasing
- Context Injection: Inject false environmental state

Example

```python
from deepteam import red_team
from deepteam.vulnerabilities.agentic import DirectControlHijacking
from deepteam.attacks.single_turn import AuthoritySpoofing

# Test if your agent can be hijacked
risk_assessment = red_team(
    model_callback=your_agent_callback,
    vulnerabilities=[DirectControlHijacking()],
    attacks=[AuthoritySpoofing()],
)
```

🧪 Happy Red Teaming, now for both chatbots and autonomous agents!

- Python
Published by penguine-ip about 1 year ago

https://github.com/confident-ai/deepteam - First Stable Release 🎉

DeepTeam v0.1.0 – First Release 🎉

We’re excited to launch the first public release of DeepTeam, the open-source framework for LLM red teaming.

🧠 DeepTeam enables you to simulate real-world attacks on language models, test for failure modes like jailbreaks, and uncover model vulnerabilities using structured, reproducible evaluation.

🚀 Features

✅ Built-in adversarial attack strategies (jailbreaks, refusal bypasses, prompt injections)
✅ Automatic generation of adversarial test cases
✅ Multi-metric evaluation (pass/fail, toxicity, relevance, etc.)
✅ Seamless integration with your LLM app and testing pipelines
✅ Type-safe Python API with minimal setup

Get started by installing deepteam:

    pip install deepteam

Docs here: https://www.trydeepteam.com/docs/getting-started

- Python
Published by penguine-ip about 1 year ago


Recent Releases of https://github.com/confident-ai/deepteam