Science Score: 44.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.7%) to scientific vocabulary
Last synced: 6 months ago

Repository

Basic Info
  • Host: GitHub
  • Owner: scalable-model-editing
  • License: mit
  • Language: Python
  • Default Branch: main
  • Size: 22.3 MB
Statistics
  • Stars: 11
  • Watchers: 0
  • Forks: 1
  • Open Issues: 2
  • Releases: 0
Created almost 2 years ago · Last pushed almost 2 years ago
Metadata Files
Readme License Citation

README.md

Rebuilding ROME: Resolving Model Collapse during Model Editing

Changes to the update equation

We focus on the way the MLP keys ($k$ such that $Wk = v$) are computed. See the `rome/compute_u.py` and `rome/compute_v.py` scripts for details. The derived ROME update equation is

$$\hat{W} = W + \Lambda_* \left(C^{-1} k_*\right)^T$$

where

$$\Lambda_* = \frac{v_* - W k_*}{\left(C^{-1} k_*\right)^T k_*}$$

$$k_* = \frac{1}{N} \sum_{j=1}^N k\left(x_j + s\right).$$

$$k(x) = \sigma\left( W_{fc}^{\left(l^*\right)} \, \gamma\left( a_{[x],i}^{\left(l^*\right)} + h_{[x],i}^{\left(l^*-1\right)} \right) \right)$$
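The averaging in the definition of $k_*$ can be illustrated with a toy sketch. Everything here is a stand-in (a made-up `mlp_key` with ReLU in place of the model's nonlinearity, random vectors in place of token embeddings); the real computation lives in `rome/compute_u.py`, where $k(\cdot)$ is read off the MLP input via a forward hook:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 8
W_fc = rng.normal(size=(d, d))

def mlp_key(tokens: np.ndarray) -> np.ndarray:
    """Toy k(x): nonlinearity applied to a projection of a pooled representation."""
    pooled = tokens.mean(axis=0)
    return np.maximum(W_fc @ pooled, 0.0)  # ReLU as an illustrative sigma

subject = rng.normal(size=(3, d))  # stand-in embeddings for the subject s

# N random prefixes x_j of varying length, prepended to the subject
prefixes = [rng.normal(size=(int(rng.integers(1, 5)), d)) for _ in range(20)]

# k_* = (1/N) * sum_j k(x_j + s): average the key over prefixed contexts
k_star = np.mean([mlp_key(np.vstack([p, subject])) for p in prefixes], axis=0)
```

Averaging over random prefixed contexts makes $k_*$ a context-robust representation of the subject, rather than the key from a single prompt.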

Note that the optimization step to compute $v_*$ is based on $k_*$. The original ROME implementation, however, computes

$$\hat{W}=W+\Lambda\left(C^{-1} k_*\right)^T$$

where

$$\Lambda = \frac{v_* - Wk}{\left(C^{-1} k\right)^T k}$$

$$k = k(s)$$

We find that the latter leads to rapid degradation of model performance in a sequential editing setting and is prone to particular edits, known as disabling edits, that render the model unusable post-update. Our experiments focus on unifying the computation of the keys in the update equation, and we study the use of both $k$ and $k_*$.
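The derived update above is a rank-one correction to the MLP weight. As a rough illustration of the algebra (not the repository's actual implementation; all shapes and values here are toy stand-ins), the following NumPy sketch applies $\hat{W} = W + \Lambda_* (C^{-1} k_*)^T$ and checks that the edited weight maps $k_*$ exactly to $v_*$:

```python
import numpy as np

# Toy stand-ins: W is the MLP weight being edited, C an uncentered
# key covariance, k_star the averaged subject key, v_star the target value.
rng = np.random.default_rng(0)
d_in, d_out = 8, 6

W = rng.normal(size=(d_out, d_in))
C = np.cov(rng.normal(size=(d_in, 100)))
C_inv = np.linalg.inv(C + 1e-3 * np.eye(d_in))  # small ridge for stability

k_star = rng.normal(size=d_in)
v_star = rng.normal(size=d_out)

# Lambda_* = (v_* - W k_*) / ((C^{-1} k_*)^T k_*)
c_inv_k = C_inv @ k_star
lam = (v_star - W @ k_star) / (c_inv_k @ k_star)

# W_hat = W + Lambda_* (C^{-1} k_*)^T  -- a rank-one update
W_hat = W + np.outer(lam, c_inv_k)

# By construction, the edited weight now maps k_* to v_* exactly.
assert np.allclose(W_hat @ k_star, v_star)
```

The algebra makes the guarantee visible: $\hat{W} k_* = W k_* + \Lambda_* (C^{-1}k_*)^T k_* = v_*$, which is why mixing $k$ and $k_*$ between the $\Lambda$ computation and the outer product (as in the original implementation) breaks this exactness.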

Installation

We recommend using Docker to set up a clean dev environment.

```shell
docker compose up -d --build
```

To download the datasets used for evaluation, install Git LFS if needed:

```shell
curl -s https://packagecloud.io/install/repositories/github/git-lfs/script.deb.sh | sudo bash
sudo apt-get install -y git-lfs
git lfs pull
```

Running the experiments

The evaluation script supports sequential editing with the --sequential flag. With sequential editing, the edited model is evaluated for downstream task performance on 4 GLUE datasets after every 20 edits. The interval can be changed in the codebase.
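The sequential-editing loop described above can be sketched as follows. This is a hedged outline, not the repository's code: `apply_edit` and `eval_glue` are hypothetical placeholders for the actual edit application and GLUE evaluation routines.

```python
# Sketch: apply edits one at a time, probing downstream (e.g. GLUE)
# performance at a fixed interval to detect gradual degradation.
EVAL_INTERVAL = 20  # the README's default; configurable in the codebase

def run_sequential(model, edits, apply_edit, eval_glue):
    """Apply `edits` sequentially, recording eval scores every EVAL_INTERVAL edits."""
    scores = {}
    for i, edit in enumerate(edits, start=1):
        model = apply_edit(model, edit)
        if i % EVAL_INTERVAL == 0:
            scores[i] = eval_glue(model)  # e.g. scores on 4 GLUE tasks
    return scores
```

A disabling edit shows up in such a trace as a sudden collapse in the recorded scores rather than gradual drift.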

You can evaluate either GPT2-XL or GPTJ-6B using the appropriate hyperparameter file to configure how the update equation is computed.

```shell
python experiments/evaluate.py \
    --model_name=${MODEL_NAME} \
    --hparams_fname=${HPARAM_FILE_NAME} \
    --ds_name=cf \
    --sequential
```

How to Cite

If you find our work useful, please cite it using the following:

```bibtex
@article{gupta2024rebuilding,
  title={Rebuilding ROME: Resolving Model Collapse during Sequential Model Editing},
  author={Gupta, Akshat and Anumanchipalli, Gopala},
  journal={arXiv preprint arXiv:2403.07175},
  year={2024}
}
```

```bibtex
@article{gupta2024model,
  title={Model Editing at Scale leads to Gradual and Catastrophic Forgetting},
  author={Gupta, Akshat and Rao, Anurag and Anumanchipalli, Gopala},
  journal={arXiv preprint arXiv:2401.07453},
  year={2024}
}
```

Owner

  • Name: scalable-model-editing
  • Login: scalable-model-editing
  • Kind: organization

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite it as below."
preferred-citation:
  type: article
  authors:
  - family-names: "Meng"
    given-names: "Kevin"
  - family-names: "Bau"
    given-names: "David"
  - family-names: "Andonian"
    given-names: "Alex"
  - family-names: "Belinkov"
    given-names: "Yonatan"
  journal: "arXiv preprint arXiv:2202.05262"
  title: "Locating and Editing Factual Associations in GPT"
  year: 2022

GitHub Events

Total
  • Issues event: 2
  • Watch event: 3
  • Fork event: 2
Last Year
  • Issues event: 2
  • Watch event: 3
  • Fork event: 2

Dependencies

baselines/mend/requirements.txt pypi
  • allennlp *
  • click ==7.1.2
  • datasets *
  • hydra-core *
  • jsonlines *
  • numpy *
  • spacy *
  • torch *
  • wandb *
Dockerfile docker
  • nvidia/cuda 11.8.0-devel-ubuntu22.04 build
docker-compose.yml docker
baselines/kn/knowledge_neurons/requirements.txt pypi
  • einops *
  • numpy *
  • seaborn *
  • torch *
  • transformers *
baselines/kn/knowledge_neurons/setup.py pypi
  • transformers *
requirements.txt pypi
  • datasets *
  • higher *
  • hydra-core *
  • ipykernel *
  • jupyter *
  • matplotlib *
  • nltk *
  • numpy *
  • scikit-learn ==1.0.2
  • scipy *
  • seaborn *
  • transformers *