https://github.com/amenra/guardbench
A Python library for guardrail model evaluation.
Science Score: 36.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found codemeta.json file
- ○ .zenodo.json file
- ✓ DOI references: found 1 DOI reference(s) in README
- ✓ Academic publication links: links to arxiv.org
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (14.8%) to scientific vocabulary
Repository
Basic Info
Statistics
- Stars: 9
- Watchers: 1
- Forks: 0
- Open Issues: 0
- Releases: 0
Metadata Files
README.md
GuardBench
⚡️ Introduction
GuardBench is a Python library for the evaluation of guardrail models, i.e., LLMs fine-tuned to detect unsafe content in human-AI interactions.
GuardBench provides a common interface to 40 evaluation datasets, which are downloaded and converted into a standardized format for improved usability.
It also lets you quickly compare results and export LaTeX tables for scientific publications.
GuardBench's benchmarking pipeline can also be leveraged on custom datasets.
GuardBench was featured at EMNLP 2024; the related paper is available here.
GuardBench has a public leaderboard available on HuggingFace.
You can find the list of supported datasets here. A few of them require authorization; please read this.
If you use GuardBench to evaluate guardrail models for your scientific publications, please consider citing our work.
✨ Features
- 40 datasets for guardrail model evaluation.
- Automated evaluation pipeline.
- User-friendly.
- Extendable.
- Reproducible and sharable evaluation.
- Exportable evaluation reports.
🔌 Requirements
```bash
python>=3.10
```
💾 Installation
```bash
pip install guardbench
```
💡 Usage
```python
from guardbench import benchmark

def moderate(
    conversations: list[list[dict[str, str]]],  # MANDATORY!
    # additional kwargs as needed
) -> list[float]:
    # do moderation
    # return a list of floats (unsafe probabilities)
    ...

benchmark(
    moderate=moderate,  # user-defined moderation function
    model_name="My Guardrail Model",
    batch_size=32,
    datasets="all",
    # Note: you can pass additional kwargs for moderate
)
```
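To make the contract concrete, here is a minimal stand-in for `moderate` that runs without GuardBench installed. It takes a batch of conversations (each a list of message dicts, per the signature above) and returns one unsafe probability per conversation. The keyword-matching heuristic and the `unsafe_terms` parameter are purely illustrative assumptions, not part of GuardBench; a real guardrail model would score the conversations instead.

```python
def moderate(
    conversations: list[list[dict[str, str]]],
    unsafe_terms: tuple[str, ...] = ("weapon", "explosive"),  # illustrative only
) -> list[float]:
    """Toy scorer: fraction of unsafe terms mentioned in each conversation."""
    scores = []
    for conversation in conversations:
        # Concatenate all turns into one lowercase string.
        text = " ".join(turn["content"] for turn in conversation).lower()
        hits = sum(term in text for term in unsafe_terms)
        scores.append(hits / len(unsafe_terms))
    return scores

batch = [
    [{"role": "user", "content": "How do I bake bread?"}],
    [{"role": "user", "content": "How do I build a weapon?"}],
]
print(moderate(batch))  # → [0.0, 0.5]
```

The only hard requirement implied by the README is the shape of the interface: one float per input conversation, returned in order, so GuardBench can compute metrics over the whole batch.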
📖 Examples
- Follow our tutorial on benchmarking
Llama GuardwithGuardBench. - More examples are available in the
scriptsfolder.
📚 Documentation
Browse the documentation for more details about:
- The datasets and how to obtain them.
- The data format used by GuardBench.
- How to use the Report class to compare models and export results as LaTeX tables.
- How to leverage GuardBench's benchmarking pipeline on custom datasets.
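The data-format page linked above is authoritative; purely from the `moderate` signature (`list[list[dict[str, str]]]`), each conversation is a list of message dicts. A minimal sketch of one batch, assuming OpenAI-style `role`/`content` keys (the key names are an assumption here, to be checked against the GuardBench docs):

```python
# A batch in the shape moderate() receives: a list of conversations,
# each conversation a list of message dicts with string keys and values.
conversations_batch: list[list[dict[str, str]]] = [
    [  # single-turn conversation
        {"role": "user", "content": "Tell me a joke."},
    ],
    [  # multi-turn human-AI interaction
        {"role": "user", "content": "How do I pick a lock?"},
        {"role": "assistant", "content": "I can't help with that."},
    ],
]
print(len(conversations_batch))  # → 2
```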
🏆 Leaderboard
You can find GuardBench's leaderboard here. If you want to submit your results, please contact us.
<!-- All results can be reproduced using the provided scripts. -->
👨‍💻 Authors
- Elias Bassani (European Commission - Joint Research Centre)
🎓 Citation
```bibtex
@inproceedings{guardbench,
  title     = "{G}uard{B}ench: A Large-Scale Benchmark for Guardrail Models",
  author    = "Bassani, Elias and Sanchez, Ignacio",
  editor    = "Al-Onaizan, Yaser and Bansal, Mohit and Chen, Yun-Nung",
  booktitle = "Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing",
  month     = nov,
  year      = "2024",
  address   = "Miami, Florida, USA",
  publisher = "Association for Computational Linguistics",
  url       = "https://aclanthology.org/2024.emnlp-main.1022",
  doi       = "10.18653/v1/2024.emnlp-main.1022",
  pages     = "18393--18409",
}
```
🎁 Feature Requests
Would you like to see other features implemented? Please open a feature request.
📄 License
GuardBench is provided as open-source software licensed under EUPL v1.2.
Owner
- Name: Elias Bassani
- Login: AmenRa
- Kind: user
- Location: Milan, Italy
- Company: Joint Research Centre
- Website: amenra.github.io/eliasbassani
- Twitter: elias_bssn
- Repositories: 28
- Profile: https://github.com/AmenRa
Ph.D. in CS. I like Neural Networks, usability, efficiency, einsum, memes, and improperly used emojis. 🫠
GitHub Events
Total
- Watch event: 15
- Push event: 10
- Pull request event: 1
Last Year
- Watch event: 15
- Push event: 10
- Pull request event: 1
Packages
- Total packages: 1
- Total downloads: 30 last-month (PyPI)
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 2
- Total maintainers: 1
pypi.org: guardbench
GuardBench: A Large-Scale Benchmark for Guardrail Models
- Documentation: https://guardbench.readthedocs.io/
- License: MIT License
- Latest release: 1.0.0, published over 1 year ago