specificityplus
👩‍💻 Code for the ACL paper "Detecting Edit Failures in LLMs: An Improved Specificity Benchmark"
Science Score: 36.0%
This score indicates how likely this project is to be science-related, based on the following indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file: found
- ✓ .zenodo.json file: found
- ○ DOI references
- ○ Academic publication links
- ✓ Committers with academic emails: 2 of 11 committers (18.2%) from academic institutions
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: low similarity (8.8%) to scientific vocabulary
Repository
Basic Info
- Host: GitHub
- Owner: apartresearch
- License: other
- Language: Python
- Default Branch: main
- Homepage: https://specificityplus.apartresearch.com
- Size: 71.6 MB
Statistics
- Stars: 20
- Watchers: 2
- Forks: 4
- Open Issues: 2
- Releases: 0
Metadata Files
README.md
Detecting Edit Failures in LLMs: An Improved Specificity Benchmark (website)
This repository contains the code for the paper Detecting Edit Failures in LLMs: An Improved Specificity Benchmark (ACL Findings 2023).
It extends previous work on model editing by Meng et al. [1] by introducing a new benchmark, called CounterFact+, for measuring the specificity of model edits.
Attribution
This repository is a fork of MEMIT, which implements the model editing algorithms MEMIT (Mass Editing Memory in a Transformer) and ROME (Rank-One Model Editing). Our fork extends that code with additional evaluation scripts that implement the CounterFact+ benchmark. For installation instructions, see the original repository.
Installation
We recommend conda for managing Python, CUDA, and PyTorch; pip is for everything else. To get started, simply install conda and run:
```bash
CONDA_HOME=$CONDA_HOME ./scripts/setup_conda.sh
```
$CONDA_HOME should be the path to your conda installation, e.g., ~/miniconda3.
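If you are unsure what to pass as `$CONDA_HOME`, the following is a minimal sketch, not part of this repository, of how one might guess it from Python. The helper name `guess_conda_home` is hypothetical. It assumes conda's usual layout, where the executable sits one directory below the installation root (for example `<root>/bin/conda`), and that an activated conda shell exports `CONDA_EXE`.

```python
import os
import shutil
from pathlib import Path


def guess_conda_home(env=None):
    """Best-effort guess of the conda installation root (hypothetical helper).

    Returns a Path, or None if no conda installation can be inferred.
    """
    env = os.environ if env is None else env

    # 1. An explicitly set CONDA_HOME always wins.
    if env.get("CONDA_HOME"):
        return Path(env["CONDA_HOME"]).expanduser()

    # 2. CONDA_EXE points at the conda executable, assumed to be <root>/bin/conda.
    if env.get("CONDA_EXE"):
        return Path(env["CONDA_EXE"]).parent.parent

    # 3. Fall back to a `conda` executable found on PATH (same layout assumption).
    exe = shutil.which("conda", path=env.get("PATH"))
    if exe:
        return Path(exe).resolve().parent.parent

    return None


if __name__ == "__main__":
    home = guess_conda_home()
    print(home if home is not None else "conda not found; set CONDA_HOME manually")
```

The printed path is only a guess; check that it matches your actual installation (e.g., `~/miniconda3`) before passing it to `./scripts/setup_conda.sh`.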
Running Experiments
See INSTRUCTIONS.md for instructions on how to run the experiments and evaluations.
How to Cite
If you find our paper useful, please consider citing as:
```bibtex
@inproceedings{jason2023detecting,
  title = {Detecting Edit Failures In Large Language Models: An Improved Specificity Benchmark},
  author = {Hoelscher-Obermaier, Jason and Persson, Julia and Kran, Esben and Konstas, Ioannis and Barez, Fazl},
  booktitle = {Findings of ACL},
  year = {2023},
  organization = {Association for Computational Linguistics}
}
```
Owner
- Name: apartresearch
- Login: apartresearch
- Kind: organization
- Email: operations@apartresearch.com
- Website: https://apartresearch.com
- Twitter: apartresearch
- Repositories: 5
- Profile: https://github.com/apartresearch
Committers
Last synced: about 2 years ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| JuliaHPersson | j****n@g****m | 242 |
| Jason Hoelscher-Obermaier | j****r@g****m | 104 |
| - | - | 55 |
| Esben Kran | e****n@k****i | 33 |
| Kevin Meng | k****1@g****m | 6 |
| Fazl Barez | s****9@l****k | 3 |
| Jason Hoelscher-Obermaier | j****o | 3 |
| fbarez | 3****z | 3 |
| David Bau | d****u@g****m | 2 |
| Julia Persson | 5****n | 2 |
| Fazl Barez | s****9@u****k | 1 |
Issues and Pull Requests
Last synced: about 2 years ago
All Time
- Total issues: 24
- Total pull requests: 39
- Average time to close issues: 20 days
- Average time to close pull requests: 2 days
- Total issue authors: 3
- Total pull request authors: 4
- Average comments per issue: 0.92
- Average comments per pull request: 0.33
- Merged pull requests: 36
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 11
- Pull requests: 27
- Average time to close issues: about 1 month
- Average time to close pull requests: about 5 hours
- Issue authors: 3
- Pull request authors: 4
- Average comments per issue: 1.0
- Average comments per pull request: 0.37
- Merged pull requests: 26
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- jas-ho (20)
- fbarez (3)
- esbenkc (1)
Pull Request Authors
- jas-ho (25)
- JuliaHPersson (7)
- fbarez (5)
- esbenkc (3)
Dependencies
- allennlp *
- click ==7.1.2
- datasets *
- hydra-core *
- jsonlines *
- numpy *
- spacy *
- torch *
- wandb *
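To see how an existing environment compares with the list above, a small check along these lines may help. This is a sketch under stated assumptions, not a script shipped with the repository: the function name `check_dependencies` is hypothetical, `*` is interpreted as "any version", and the names are assumed to match the distribution names that pip records.

```python
from importlib import metadata

# Names and pins copied from the dependency list above; None means unpinned ("*").
DEPENDENCIES = {
    "allennlp": None,
    "click": "7.1.2",
    "datasets": None,
    "hydra-core": None,
    "jsonlines": None,
    "numpy": None,
    "spacy": None,
    "torch": None,
    "wandb": None,
}


def check_dependencies(deps=DEPENDENCIES):
    """Return {name: status}; status is 'ok', 'missing', or a version-mismatch note."""
    report = {}
    for name, pinned in deps.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if pinned is not None and installed != pinned:
            report[name] = f"found {installed}, expected {pinned}"
        else:
            report[name] = "ok"
    return report


if __name__ == "__main__":
    for name, status in check_dependencies().items():
        print(f"{name:12s} {status}")
```

Only `click` carries an exact pin in the list, so it is the only package for which the sketch can report a version mismatch.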