naacl2024-fun

Fisher information based schedule unfreezing

https://github.com/ukplab/naacl2024-fun

Science Score: 65.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
    Found 1 DOI reference(s) in README
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
    Organization ukplab has institutional domain (www.ukp.tu-darmstadt.de)
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.1%) to scientific vocabulary
Last synced: 7 months ago

Repository

Fisher information based schedule unfreezing

Basic Info
  • Host: GitHub
  • Owner: UKPLab
  • License: other
  • Language: Python
  • Default Branch: main
  • Size: 6.8 MB
Statistics
  • Stars: 1
  • Watchers: 6
  • Forks: 0
  • Open Issues: 0
  • Releases: 0
Created about 3 years ago · Last pushed about 1 year ago
Metadata Files
Readme Contributing License Code of conduct Citation

README.md

FUN with Fisher: Improving Generalization of Adapter-Based Cross-lingual Transfer with Scheduled Unfreezing

This project contains the experiment code for the respective publication: training task adapters by progressively unfreezing them during training. It is built on a fork of the Hugging Face Transformers library and the adapter-transformers library.

https://www.ukp.tu-darmstadt.de/

https://www.tu-darmstadt.de/

This repository contains experimental software and is published for the sole purpose of giving additional background details on the respective publication. If you encounter any issues, please do not hesitate to email us at: chen.liu AT tu-darmstadt DOT de

Installation

Please see the original Adapter_README.md for adapter-transformers. To install from source, clone the repository:

git clone https://github.com/ukplab/naacl2024-fun
cd naacl2024-fun
pip install -e .

Getting Started

The scripts for running our experiments are:

  1. examples/pytorch/question-answering/run_qa.py - XQuAD/MLQA
  2. examples/pytorch/multiple-choice/run_copa.py - XCOPA
  3. examples/pytorch/text-classification/run_xnli.py - XNLI

Note: If you don't wish to send results to wandb, please comment out the import wandb lines in the scripts. Alternatively, to enable logging to wandb, add a call such as wandb.init(project="myproject", entity="abcde") to the scripts.

Other than the args from the original adapter-transformers, there are several important args:

  1. --freeze_all [bool] Freezes all layers of "roberta"- or "bert"-based models. This is required for all scheduled-unfreezing methods.

  2. --use_gu [bool] Runs gradual unfreezing at predetermined intervals.

  3. --use_schedule [bool] Turns on the other scheduled-unfreezing methods.

  4. --unfreeze_interval [str] Unfreezing interval. Available options: 50-12, 100-12, 800-12, 1000-12.

  5. --schedule_style [str] Chooses one of the scheduled-unfreezing styles: "lpft", "one", "rand", or a preset schedule (e.g. "schedule-1"). Note: this is not needed to run gradual unfreezing.

  6. --exp_name [str] Experiment name to report to wandb.
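For intuition, the selection step behind Fisher-based scheduled unfreezing (--schedule_style one) can be sketched as below. This is an illustrative approximation only, not the repository's implementation: it scores each still-frozen module by the sum of squared parameter gradients (the diagonal of the empirical Fisher, whose trace is tr(F)) and unfreezes the highest-scoring module at each interval. The function names are hypothetical.

```python
import torch
import torch.nn as nn

def fisher_trace(module: nn.Module) -> float:
    """Approximate tr(F) for a module as the sum of squared parameter
    gradients (the diagonal of the empirical Fisher information)."""
    total = 0.0
    for p in module.parameters():
        if p.grad is not None:
            total += float(p.grad.detach().pow(2).sum())
    return total

def maybe_unfreeze(frozen: dict, step: int, interval: int = 50):
    """Every `interval` training steps, unfreeze the frozen module with
    the largest Fisher trace and return its name; otherwise return None."""
    if step % interval != 0 or not frozen:
        return None
    name = max(frozen, key=lambda n: fisher_trace(frozen[n]))
    for p in frozen[name].parameters():
        p.requires_grad_(True)   # make the module trainable again
    del frozen[name]             # no longer a candidate
    return name
```

In the actual scripts this decision would sit inside the training loop, after the backward pass has populated the gradients used for the Fisher estimate.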

To run an experiment

Assume you want to run on a QA dataset with gradual unfreezing (--freeze_all freezes all the adapters initially; --use_gu turns on gradual unfreezing):

python run_qa.py \
  --model_name_or_path xlm-roberta-base \
  --dataset_name squad \
  --seed $seed \
  --do_train \
  --do_eval \
  --freeze_all \
  --use_gu \
  --train_adapter \
  --adapter_config "pfeiffer+inv" \
  --load_lang_adapter en/wiki@ukp \
  --language en \
  --per_device_train_batch_size 32 \
  --per_device_eval_batch_size 32 \
  --learning_rate 5e-4 \
  --num_train_epochs 15 \
  --overwrite_output_dir \
  --save_total_limit 2 \
  --load_best_model_at_end \
  --evaluation_strategy steps \
  --metric_for_best_model eval_f1 \
  --greater_is_better True \
  --max_seq_length 384 \
  --doc_stride 128 \
  --exp_name xlmr_squad_gu_$seed \
  --output_dir squad_gu_$seed

If you want to run on a QA dataset with FUN, unfreezing every 50 steps (--freeze_all freezes all the adapters initially; --use_schedule selects a schedule other than gradual unfreezing; --schedule_style one unfreezes one adapter at a time using tr(F); --unfreeze_interval 50-12 unfreezes an adapter every 50 steps):

python run_qa.py \
  --model_name_or_path xlm-roberta-base \
  --dataset_name squad \
  --seed $seed \
  --do_train \
  --do_eval \
  --freeze_all \
  --use_schedule \
  --schedule_style one \
  --unfreeze_interval 50-12 \
  --train_adapter \
  --adapter_config "pfeiffer+inv" \
  --load_lang_adapter en/wiki@ukp \
  --language en \
  --per_device_train_batch_size 32 \
  --per_device_eval_batch_size 32 \
  --learning_rate 5e-4 \
  --num_train_epochs 15 \
  --overwrite_output_dir \
  --save_total_limit 2 \
  --load_best_model_at_end \
  --evaluation_strategy steps \
  --metric_for_best_model eval_f1 \
  --greater_is_better True \
  --max_seq_length 384 \
  --doc_stride 128 \
  --exp_name xlmr_squad_fun_$seed \
  --output_dir squad_fun_$seed

Citation

If you use this for your work, please consider citing our paper, as well as AdapterHub.

@inproceedings{liu-etal-2024-fun,
    title = "{FUN} with Fisher: Improving Generalization of Adapter-Based Cross-lingual Transfer with Scheduled Unfreezing",
    author = "Liu, Chen and Pfeiffer, Jonas and Vuli{\'c}, Ivan and Gurevych, Iryna",
    editor = "Duh, Kevin and Gomez, Helena and Bethard, Steven",
    booktitle = "Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)",
    month = jun,
    year = "2024",
    address = "Mexico City, Mexico",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.naacl-long.111/",
    doi = "10.18653/v1/2024.naacl-long.111",
    pages = "1998--2015"
}

Owner

  • Name: Ubiquitous Knowledge Processing Lab
  • Login: UKPLab
  • Kind: organization
  • Location: Darmstadt, Germany

Citation (CITATION.cff)

cff-version: "1.2.0"
date-released: 2020-10
message: "If you use this software, please cite it as below."
title: "AdapterHub: A Framework for Adapting Transformers"
url: "https://github.com/Adapter-Hub/adapter-transformers"
authors: 
  - family-names: Pfeiffer
    given-names: Jonas
  - family-names: Rücklé
    given-names: Andreas
  - family-names: Poth
    given-names: Clifton
  - family-names: Kamath
    given-names: Aishwarya
  - family-names: Vulić
    given-names: Ivan
  - family-names: Ruder
    given-names: Sebastian
  - family-names: Cho
    given-names: Kyunghyun
  - family-names: Gurevych
    given-names: Iryna
preferred-citation:
  type: inproceedings
  authors:
  - family-names: Pfeiffer
    given-names: Jonas
  - family-names: Rücklé
    given-names: Andreas
  - family-names: Poth
    given-names: Clifton
  - family-names: Kamath
    given-names: Aishwarya
  - family-names: Vulić
    given-names: Ivan
  - family-names: Ruder
    given-names: Sebastian
  - family-names: Cho
    given-names: Kyunghyun
  - family-names: Gurevych
    given-names: Iryna
  booktitle: "Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations"
  month: 10
  start: 46
  end: 54
  title: "AdapterHub: A Framework for Adapting Transformers"
  year: 2020
  publisher: "Association for Computational Linguistics"
  url: "https://aclanthology.org/2020.emnlp-demos.7"
  address: "Online"

GitHub Events

Total
  • Push event: 1
Last Year
  • Push event: 1

Committers

Last synced: 10 months ago

All Time
  • Total Commits: 2
  • Total Committers: 1
  • Avg Commits per committer: 2.0
  • Development Distribution Score (DDS): 0.0
Past Year
  • Commits: 1
  • Committers: 1
  • Avg Commits per committer: 1.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
ccliu2 c****u@g****m 2

Issues and Pull Requests

Last synced: 10 months ago

All Time
  • Total issues: 0
  • Total pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 0
  • Total pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
Pull Request Authors
Top Labels
Issue Labels
Pull Request Labels

Dependencies

docker/transformers-all-latest-gpu/Dockerfile docker
  • nvidia/cuda 11.2.2-cudnn8-devel-ubuntu20.04 build
docker/transformers-cpu/Dockerfile docker
  • ubuntu 18.04 build
docker/transformers-doc-builder/Dockerfile docker
  • python 3.8 build
docker/transformers-gpu/Dockerfile docker
  • nvidia/cuda 10.2-cudnn7-devel-ubuntu18.04 build
docker/transformers-pytorch-cpu/Dockerfile docker
  • ubuntu 18.04 build
docker/transformers-pytorch-deepspeed-latest-gpu/Dockerfile docker
  • nvcr.io/nvidia/pytorch 21.03-py3 build
docker/transformers-pytorch-gpu/Dockerfile docker
  • nvidia/cuda 11.2.2-cudnn8-devel-ubuntu20.04 build
docker/transformers-pytorch-tpu/Dockerfile docker
  • google/cloud-sdk slim build
docker/transformers-tensorflow-cpu/Dockerfile docker
  • ubuntu 18.04 build
docker/transformers-tensorflow-gpu/Dockerfile docker
  • nvidia/cuda 11.2.2-cudnn8-devel-ubuntu20.04 build
examples/pytorch/_tests_requirements.txt pypi
  • accelerate main test
  • conllu * test
  • datasets >=1.13.3 test
  • elasticsearch * test
  • faiss-cpu * test
  • fire * test
  • git-python ==1.0.3 test
  • jiwer * test
  • librosa * test
  • matplotlib * test
  • nltk * test
  • pandas * test
  • protobuf * test
  • psutil * test
  • pytest * test
  • rouge-score * test
  • sacrebleu >=1.4.12 test
  • scikit-learn * test
  • sentencepiece * test
  • seqeval * test
  • streamlit * test
  • tensorboard * test
  • tensorflow_datasets * test
  • torchvision * test
examples/pytorch/audio-classification/requirements.txt pypi
  • datasets >=1.14.0
  • librosa *
  • torch >=1.6
  • torchaudio *
examples/pytorch/contrastive-image-text/requirements.txt pypi
  • datasets >=1.8.0
  • torch >=1.5.0
  • torchvision >=0.6.0
examples/pytorch/dependency-parsing/requirements.txt pypi
  • conllu *
  • datasets >=1.8.0
  • torch >=1.3
examples/pytorch/image-classification/requirements.txt pypi
  • datasets >=1.8.0
  • torch >=1.5.0
  • torchvision >=0.6.0
examples/pytorch/image-pretraining/requirements.txt pypi
  • datasets >=1.8.0
  • torch >=1.5.0
  • torchvision >=0.6.0
examples/pytorch/language-modeling/requirements.txt pypi
  • accelerate *
  • datasets >=1.8.0
  • protobuf *
  • sentencepiece *
  • torch >=1.3
examples/pytorch/question-answering/requirements.txt pypi
  • accelerate *
  • datasets >=1.8.0
  • torch >=1.3.0
examples/pytorch/semantic-segmentation/requirements.txt pypi
  • datasets >=2.0.0
  • torch >=1.3
examples/pytorch/speech-pretraining/requirements.txt pypi
  • accelerate >=0.5.0
  • datasets >=1.12.0
  • librosa *
  • torch >=1.5
  • torchaudio *
examples/pytorch/speech-recognition/requirements.txt pypi
  • datasets >=1.18.0
  • jiwer *
  • librosa *
  • torch >=1.5
  • torchaudio *
examples/pytorch/summarization/requirements.txt pypi
  • accelerate *
  • datasets >=1.8.0
  • nltk *
  • protobuf *
  • py7zr *
  • rouge-score *
  • sentencepiece *
  • torch >=1.3
examples/pytorch/text-generation/requirements.txt pypi
  • protobuf *
  • sentencepiece *
  • torch >=1.3
examples/pytorch/token-classification/requirements.txt pypi
  • accelerate *
  • datasets >=1.8.0
  • seqeval *
  • torch >=1.3
examples/pytorch/translation/requirements.txt pypi
  • accelerate *
  • datasets >=1.8.0
  • protobuf *
  • py7zr *
  • sacrebleu >=1.4.12
  • sentencepiece *
  • torch >=1.3
pyproject.toml pypi
setup.py pypi
  • deps *
tests/sagemaker/scripts/pytorch/requirements.txt pypi
  • datasets ==1.8.0 test
tests/sagemaker/scripts/tensorflow/requirements.txt pypi