rebuilding-rome
Science Score: 44.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file (found CITATION.cff file)
- ✓ codemeta.json file (found codemeta.json file)
- ✓ .zenodo.json file (found .zenodo.json file)
- ○ DOI references
- ○ Academic publication links
- ○ Academic email domains
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity (low similarity, 11.7%, to scientific vocabulary)
Repository
Basic Info
- Host: GitHub
- Owner: scalable-model-editing
- License: mit
- Language: Python
- Default Branch: main
- Size: 22.3 MB
Statistics
- Stars: 11
- Watchers: 0
- Forks: 1
- Open Issues: 2
- Releases: 0
Metadata Files
README.md
Rebuilding ROME: Resolving Model Collapse during Model Editing
Changes to the update equation
We focus on the way the MLP keys ($k$ such that $Wk=v$) are computed. See the rome/compute_u.py and rome/compute_v.py scripts for details.
The derived ROME update equation is
$$\hat{W}=W+\Lambda_* \left(C^{-1} k_*\right)^T$$
where
$$\Lambda_* =\frac{v_* - Wk_*}{\left(C^{-1} k_*\right)^T k_*}$$
$$k_* =\frac{1}{N} \sum_{j=1}^N k\left(x_j+s\right)$$
$$k(x)= \sigma\left( W_{fc}^{\left(l^*\right)} \gamma\left( a_{[x], i}^{\left(l^*\right)} + h_{[x], i}^{\left(l^* -1\right)} \right)\right)$$
Note that the optimization step to compute $v_*$ is based on $k_*$. The original ROME implementation, however, computes
$$\hat{W}=W+\Lambda\left(C^{-1} k_*\right)^T$$
where
$$\Lambda=\frac{v_* - Wk}{\left(C^{-1} k\right)^T k}$$
$$k = k(s)$$
We find that the latter leads to rapid degradation in model performance in a sequential-editing setting and is prone to particular edits, known as disabling edits, that render the model unusable post-update. Our experiments focus on unifying the computation of the keys in the update equation, and we study the use of both $k$ and $k_*$.
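As a sanity check on the derived update, the rank-one edit $\hat{W}=W+\Lambda_*\left(C^{-1}k_*\right)^T$ can be sketched in a few lines of NumPy. The random matrices, the dimensions, and the identity covariance below are illustrative assumptions, not values from this repository:

```python
import numpy as np

rng = np.random.default_rng(0)
d_out, d_in, N = 6, 8, 5

W = rng.normal(size=(d_out, d_in))     # MLP weight being edited
C = np.eye(d_in)                       # key covariance C; identity here for simplicity
ks = rng.normal(size=(N, d_in))        # keys k(x_j + s) over N random prefixes
k_star = ks.mean(axis=0)               # k_*: the averaged key
v_star = rng.normal(size=d_out)        # v_*: target value from the v-optimization

C_inv_k = np.linalg.solve(C, k_star)              # C^{-1} k_*
Lam = (v_star - W @ k_star) / (C_inv_k @ k_star)  # Lambda_*
W_hat = W + np.outer(Lam, C_inv_k)                # W_hat = W + Lambda_* (C^{-1} k_*)^T

# After the edit, the weight maps k_* exactly to v_*
assert np.allclose(W_hat @ k_star, v_star)
```

Expanding $\hat{W}k_* = Wk_* + \Lambda_*\left(C^{-1}k_*\right)^T k_* = Wk_* + (v_* - Wk_*) = v_*$ confirms that the scaling by $\left(C^{-1}k_*\right)^T k_*$ in the denominator is exactly what makes the rank-one update land on the target value.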
Installation
We recommend using Docker to set up a clean dev environment.
```shell
docker compose up -d --build
```
To download the datasets used for evaluation, install Git LFS if needed:
```shell
curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash
sudo apt-get install -y git-lfs
git lfs pull
```
Running the experiments
The script supports sequential editing with the --sequential flag. With sequential editing, the edited model is evaluated for downstream task performance on 4 GLUE datasets after every 20 edits. The interval can be changed within the codebase.
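The evaluate-every-20-edits loop described above can be sketched as follows. This is a hedged illustration of the control flow only: `apply_edit` and `evaluate_glue` are hypothetical stand-ins, not functions from this repository.

```python
def apply_edit(model, request):
    """Stand-in for a single ROME weight update (hypothetical helper)."""
    model["edits"].append(request)
    return model

def evaluate_glue(model):
    """Stand-in for the downstream 4-dataset GLUE evaluation (hypothetical helper)."""
    return {"n_edits": len(model["edits"])}

def sequential_editing(model, requests, interval=20):
    """Apply edits one at a time, evaluating after every `interval` edits."""
    checkpoints = []
    for i, request in enumerate(requests, start=1):
        model = apply_edit(model, request)
        if i % interval == 0:
            checkpoints.append(evaluate_glue(model))
    return checkpoints

# 45 edits with the default interval of 20 triggers evaluation twice
checkpoints = sequential_editing({"edits": []}, list(range(45)))
```

Changing `interval` here corresponds to the in-code change mentioned above.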
You can evaluate either GPT2-XL or GPTJ-6B using the appropriate hyperparameter file to configure how the update equation is computed.
```shell
python experiments/evaluate.py \
    --model_name=${MODEL_NAME} \
    --hparams_fname=${HPARAM_FILE_NAME} \
    --ds_name=cf \
    --sequential
```
How to Cite
If you find our work useful, please cite it using the following:
```bibtex
@article{gupta2024rebuilding,
  title={Rebuilding ROME: Resolving Model Collapse during Sequential Model Editing},
  author={Gupta, Akshat and Anumanchipalli, Gopala},
  journal={arXiv preprint arXiv:2403.07175},
  year={2024}
}
```
```bibtex
@article{gupta2024model,
  title={Model Editing at Scale leads to Gradual and Catastrophic Forgetting},
  author={Gupta, Akshat and Rao, Anurag and Anumanchipalli, Gopala},
  journal={arXiv preprint arXiv:2401.07453},
  year={2024}
}
```
Owner
- Name: scalable-model-editing
- Login: scalable-model-editing
- Kind: organization
- Repositories: 1
- Profile: https://github.com/scalable-model-editing
Citation (CITATION.cff)
```yaml
cff-version: 1.2.0
message: "If you use this software, please cite it as below."
preferred-citation:
  type: article
  authors:
    - family-names: "Meng"
      given-names: "Kevin"
    - family-names: "Bau"
      given-names: "David"
    - family-names: "Andonian"
      given-names: "Alex"
    - family-names: "Belinkov"
      given-names: "Yonatan"
  journal: "arXiv preprint arXiv:2202.05262"
  title: "Locating and Editing Factual Associations in GPT"
  year: 2022
```
GitHub Events
Total
- Issues event: 2
- Watch event: 3
- Fork event: 2
Last Year
- Issues event: 2
- Watch event: 3
- Fork event: 2
Dependencies
- allennlp *
- click ==7.1.2
- datasets *
- hydra-core *
- jsonlines *
- numpy *
- spacy *
- torch *
- wandb *
- nvidia/cuda 11.8.0-devel-ubuntu22.04 build
- einops *
- numpy *
- seaborn *
- torch *
- transformers *
- transformers *
- datasets *
- higher *
- hydra-core *
- ipykernel *
- jupyter *
- matplotlib *
- nltk *
- numpy *
- scikit-learn ==1.0.2
- scipy *
- seaborn *
- transformers *