naacl2024-fun

Fisher information based schedule unfreezing

https://github.com/ukplab/naacl2024-fun

Science Score: 65.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in README
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
    Organization ukplab has institutional domain (www.ukp.tu-darmstadt.de)
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.1%) to scientific vocabulary
Last synced: 7 months ago

Repository

Fisher information based schedule unfreezing

Basic Info
  • Host: GitHub
  • Owner: UKPLab
  • License: other
  • Language: Python
  • Default Branch: main
  • Size: 6.8 MB
Statistics
  • Stars: 1
  • Watchers: 6
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created about 3 years ago · Last pushed about 1 year ago
Metadata Files
Readme Contributing License Code of conduct Citation

README.md

FUN with Fisher: Improving Generalization of Adapter-Based Cross-lingual Transfer with Scheduled Unfreezing

This project contains the experiment code for the respective publication: training task adapters by progressively unfreezing them during training. It is built on a fork of the Hugging Face Transformers library and the adapter-transformers library.

https://www.ukp.tu-darmstadt.de/

https://www.tu-darmstadt.de/

This repository contains experimental software and is published for the sole purpose of giving additional background details on the respective publication. If you encounter any issues, please do not hesitate to email us at: chen.liu AT tu-darmstadt DOT de

Installation

Please see the original Adapter_README.md for adapter-transformers. To install from source, clone the repository:

git clone https://github.com/ukplab/naacl2024-fun
cd naacl2024-fun
pip install -e .

Getting Started

The scripts for running our experiments are:

  1. examples/pytorch/question-answering/run_qa.py - XQuAD/MLQA
  2. examples/pytorch/multiple-choice/run_copa.py - XCOPA
  3. examples/pytorch/text-classification/run_xnli.py - XNLI

Note: If you don't wish to send results to wandb, please comment out the import wandb lines in the scripts. Alternatively, to enable logging to wandb, add a call such as wandb.init(project="myproject", entity="abcde") to the scripts.

Other than the args from the original adapter-transformers, there are several important args:

  1. --freeze_all [bool] Freezes all layers of "roberta"- or "bert"-based models. This is required for all scheduled-unfreezing methods.

  2. --use_gu [bool] Runs gradual unfreezing at predetermined intervals.

  3. --use_schedule [bool] Turns on the other scheduled-unfreezing methods.

  4. --unfreeze_interval [str] Unfreezing interval. Available options: 50-12, 100-12, 800-12, 1000-12.

  5. --schedule_style [str] Chooses one of the scheduled-unfreezing styles: "lpft", "one", "rand", or a preset schedule (e.g. "schedule-1"). Note: this is not needed to run gradual unfreezing.

  6. --exp_name [str] Experiment name to report to wandb.
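For intuition, the selection step behind Fisher-based scheduled unfreezing (--schedule_style one) can be sketched as below. This is an illustrative approximation only, not the repository's implementation: it scores each still-frozen module by the sum of squared parameter gradients (the diagonal of the empirical Fisher, whose trace is tr(F)) and unfreezes the highest-scoring module at each interval. The function names are hypothetical.

```python
import torch
import torch.nn as nn

def fisher_trace(module: nn.Module) -> float:
    """Approximate tr(F) for a module as the sum of squared parameter
    gradients (the diagonal of the empirical Fisher information)."""
    total = 0.0
    for p in module.parameters():
        if p.grad is not None:
            total += float(p.grad.detach().pow(2).sum())
    return total

def maybe_unfreeze(frozen: dict, step: int, interval: int = 50):
    """Every `interval` training steps, unfreeze the frozen module with
    the largest Fisher trace and return its name; otherwise return None."""
    if step % interval != 0 or not frozen:
        return None
    name = max(frozen, key=lambda n: fisher_trace(frozen[n]))
    for p in frozen[name].parameters():
        p.requires_grad_(True)   # make the module trainable again
    del frozen[name]             # no longer a candidate
    return name
```

In the actual scripts this decision would sit inside the training loop, after the backward pass has populated the gradients used for the Fisher estimate.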

To run an experiment

Assume you want to run on a QA dataset with gradual unfreezing (--freeze_all freezes all the adapters initially; --use_gu turns on gradual unfreezing):

python run_qa.py \
  --model_name_or_path xlm-roberta-base \
  --dataset_name squad \
  --seed $seed \
  --do_train \
  --do_eval \
  --freeze_all \
  --use_gu \
  --train_adapter \
  --adapter_config "pfeiffer+inv" \
  --load_lang_adapter en/wiki@ukp \
  --language en \
  --per_device_train_batch_size 32 \
  --per_device_eval_batch_size 32 \
  --learning_rate 5e-4 \
  --num_train_epochs 15 \
  --overwrite_output_dir \
  --save_total_limit 2 \
  --load_best_model_at_end \
  --evaluation_strategy steps \
  --metric_for_best_model eval_f1 \
  --greater_is_better True \
  --max_seq_length 384 \
  --doc_stride 128 \
  --exp_name xlmr_squad_gu_$seed \
  --output_dir squad_gu_$seed

If you want to run on a QA dataset with FUN, unfreezing every 50 steps (--freeze_all freezes all the adapters initially; --use_schedule selects a schedule other than gradual unfreezing; --schedule_style one unfreezes one adapter at a time using tr(F); --unfreeze_interval 50-12 unfreezes an adapter every 50 steps):

python run_qa.py \
  --model_name_or_path xlm-roberta-base \
  --dataset_name squad \
  --seed $seed \
  --do_train \
  --do_eval \
  --freeze_all \
  --use_schedule \
  --schedule_style one \
  --unfreeze_interval 50-12 \
  --train_adapter \
  --adapter_config "pfeiffer+inv" \
  --load_lang_adapter en/wiki@ukp \
  --language en \
  --per_device_train_batch_size 32 \
  --per_device_eval_batch_size 32 \
  --learning_rate 5e-4 \
  --num_train_epochs 15 \
  --overwrite_output_dir \
  --save_total_limit 2 \
  --load_best_model_at_end \
  --evaluation_strategy steps \
  --metric_for_best_model eval_f1 \
  --greater_is_better True \
  --max_seq_length 384 \
  --doc_stride 128 \
  --exp_name xlmr_squad_fun_$seed \
  --output_dir squad_fun_$seed

Citation

If you use this for your work, please consider citing our paper, as well as AdapterHub.

@inproceedings{liu-etal-2024-fun,
    title = "{FUN} with Fisher: Improving Generalization of Adapter-Based Cross-lingual Transfer with Scheduled Unfreezing",
    author = "Liu, Chen and Pfeiffer, Jonas and Vuli{\'c}, Ivan and Gurevych, Iryna",
    editor = "Duh, Kevin and Gomez, Helena and Bethard, Steven",
    booktitle = "Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)",
    month = jun,
    year = "2024",
    address = "Mexico City, Mexico",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.naacl-long.111/",
    doi = "10.18653/v1/2024.naacl-long.111",
    pages = "1998--2015"
}

Owner

  • Name: Ubiquitous Knowledge Processing Lab
  • Login: UKPLab
  • Kind: organization
  • Location: Darmstadt, Germany

Citation (CITATION.cff)

cff-version: "1.2.0"
date-released: 2020-10
message: "If you use this software, please cite it as below."
title: "AdapterHub: A Framework for Adapting Transformers"
url: "https://github.com/Adapter-Hub/adapter-transformers"
authors: 
  - family-names: Pfeiffer
    given-names: Jonas
  - family-names: Rücklé
    given-names: Andreas
  - family-names: Poth
    given-names: Clifton
  - family-names: Kamath
    given-names: Aishwarya
  - family-names: Vulić
    given-names: Ivan
  - family-names: Ruder
    given-names: Sebastian
  - family-names: Cho
    given-names: Kyunghyun
  - family-names: Gurevych
    given-names: Iryna
preferred-citation:
  type: inproceedings
  authors:
  - family-names: Pfeiffer
    given-names: Jonas
  - family-names: Rücklé
    given-names: Andreas
  - family-names: Poth
    given-names: Clifton
  - family-names: Kamath
    given-names: Aishwarya
  - family-names: Vulić
    given-names: Ivan
  - family-names: Ruder
    given-names: Sebastian
  - family-names: Cho
    given-names: Kyunghyun
  - family-names: Gurevych
    given-names: Iryna
  booktitle: "Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations"
  month: 10
  start: 46
  end: 54
  title: "AdapterHub: A Framework for Adapting Transformers"
  year: 2020
  publisher: "Association for Computational Linguistics"
  url: "https://aclanthology.org/2020.emnlp-demos.7"
  address: "Online"

GitHub Events

Total
  • Push event: 1
Last Year
  • Push event: 1

Committers

Last synced: 10 months ago

All Time
  • Total Commits: 2
  • Total Committers: 1
  • Avg Commits per committer: 2.0
  • Development Distribution Score (DDS): 0.0
Past Year
  • Commits: 1
  • Committers: 1
  • Avg Commits per committer: 1.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
ccliu2 c****u@g****m 2

Issues and Pull Requests

Last synced: 10 months ago

All Time
  • Total issues: 0
  • Total pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 0
  • Total pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
Top Labels
Issue Labels
Pull Request Labels

Dependencies

docker/transformers-all-latest-gpu/Dockerfile docker
  • nvidia/cuda 11.2.2-cudnn8-devel-ubuntu20.04 build
docker/transformers-cpu/Dockerfile docker
  • ubuntu 18.04 build
docker/transformers-doc-builder/Dockerfile docker
  • python 3.8 build
docker/transformers-gpu/Dockerfile docker
  • nvidia/cuda 10.2-cudnn7-devel-ubuntu18.04 build
docker/transformers-pytorch-cpu/Dockerfile docker
  • ubuntu 18.04 build
docker/transformers-pytorch-deepspeed-latest-gpu/Dockerfile docker
  • nvcr.io/nvidia/pytorch 21.03-py3 build
docker/transformers-pytorch-gpu/Dockerfile docker
  • nvidia/cuda 11.2.2-cudnn8-devel-ubuntu20.04 build
docker/transformers-pytorch-tpu/Dockerfile docker
  • google/cloud-sdk slim build
docker/transformers-tensorflow-cpu/Dockerfile docker
  • ubuntu 18.04 build
docker/transformers-tensorflow-gpu/Dockerfile docker
  • nvidia/cuda 11.2.2-cudnn8-devel-ubuntu20.04 build
examples/pytorch/_tests_requirements.txt pypi
  • accelerate main test
  • conllu * test
  • datasets >=1.13.3 test
  • elasticsearch * test
  • faiss-cpu * test
  • fire * test
  • git-python ==1.0.3 test
  • jiwer * test
  • librosa * test
  • matplotlib * test
  • nltk * test
  • pandas * test
  • protobuf * test
  • psutil * test
  • pytest * test
  • rouge-score * test
  • sacrebleu >=1.4.12 test
  • scikit-learn * test
  • sentencepiece * test
  • seqeval * test
  • streamlit * test
  • tensorboard * test
  • tensorflow_datasets * test
  • torchvision * test
examples/pytorch/audio-classification/requirements.txt pypi
  • datasets >=1.14.0
  • librosa *
  • torch >=1.6
  • torchaudio *
examples/pytorch/contrastive-image-text/requirements.txt pypi
  • datasets >=1.8.0
  • torch >=1.5.0
  • torchvision >=0.6.0
examples/pytorch/dependency-parsing/requirements.txt pypi
  • conllu *
  • datasets >=1.8.0
  • torch >=1.3
examples/pytorch/image-classification/requirements.txt pypi
  • datasets >=1.8.0
  • torch >=1.5.0
  • torchvision >=0.6.0
examples/pytorch/image-pretraining/requirements.txt pypi
  • datasets >=1.8.0
  • torch >=1.5.0
  • torchvision >=0.6.0
examples/pytorch/language-modeling/requirements.txt pypi
  • accelerate *
  • datasets >=1.8.0
  • protobuf *
  • sentencepiece *
  • torch >=1.3
examples/pytorch/question-answering/requirements.txt pypi
  • accelerate *
  • datasets >=1.8.0
  • torch >=1.3.0
examples/pytorch/semantic-segmentation/requirements.txt pypi
  • datasets >=2.0.0
  • torch >=1.3
examples/pytorch/speech-pretraining/requirements.txt pypi
  • accelerate >=0.5.0
  • datasets >=1.12.0
  • librosa *
  • torch >=1.5
  • torchaudio *
examples/pytorch/speech-recognition/requirements.txt pypi
  • datasets >=1.18.0
  • jiwer *
  • librosa *
  • torch >=1.5
  • torchaudio *
examples/pytorch/summarization/requirements.txt pypi
  • accelerate *
  • datasets >=1.8.0
  • nltk *
  • protobuf *
  • py7zr *
  • rouge-score *
  • sentencepiece *
  • torch >=1.3
examples/pytorch/text-generation/requirements.txt pypi
  • protobuf *
  • sentencepiece *
  • torch >=1.3
examples/pytorch/token-classification/requirements.txt pypi
  • accelerate *
  • datasets >=1.8.0
  • seqeval *
  • torch >=1.3
examples/pytorch/translation/requirements.txt pypi
  • accelerate *
  • datasets >=1.8.0
  • protobuf *
  • py7zr *
  • sacrebleu >=1.4.12
  • sentencepiece *
  • torch >=1.3
pyproject.toml pypi
setup.py pypi
  • deps *
tests/sagemaker/scripts/pytorch/requirements.txt pypi
  • datasets ==1.8.0 test
tests/sagemaker/scripts/tensorflow/requirements.txt pypi