https://github.com/chaoscodes/propetl

One Network, Many Masks: Towards More Parameter-Efficient Transfer Learning

Science Score: 36.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file: Found codemeta.json file
✓ .zenodo.json file: Found .zenodo.json file
○ DOI references
✓ Academic publication links: Links to arxiv.org
○ Academic email domains
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: Low similarity (13.7%) to scientific vocabulary

Last synced: 6 months ago · JSON representation

Repository

Basic Info

Host: GitHub
Owner: ChaosCodes
License: apache-2.0
Language: Python
Default Branch: main
Size: 4.11 MB

Statistics

Stars: 39
Watchers: 1
Forks: 3
Open Issues: 0
Releases: 0

Created almost 3 years ago · Last pushed over 2 years ago

Metadata Files

Readme

One Network, Many Masks: Towards More Parameter-Efficient Transfer Learning

This repository is the official implementation of our ACL'23 paper "One Network, Many Masks: Towards More Parameter-Efficient Transfer Learning". [ArXiv]

Overview

Fine-tuning pre-trained language models for multiple tasks tends to be expensive in terms of storage. Parameter-efficient transfer learning (PETL) methods have been proposed to address this issue, but they still require a significant number of parameters and considerable storage when applied to a broader range of tasks. To achieve even greater storage reduction, we propose PROPETL, a novel method that enables efficient sharing of a single PETL module (e.g., adapter, LoRA, or prefix-tuning), which we call the prototype network, across layers and tasks. We then learn binary masks to select different sub-networks from the shared prototype network and apply them as PETL modules in different layers. We find that the binary masks can determine crucial information from the network, which is often ignored in previous studies. Our work can also be seen as a type of pruning method, where we find that overparameterization also exists in the seemingly small PETL modules. We evaluate PROPETL on various downstream tasks and show that it can outperform other PETL methods with approximately 10% of the parameter storage required by the latter.
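To make the idea of one shared prototype with many masks more concrete, here is a minimal pure-Python sketch. It is not the code from this repository: the sizes, the keep ratio, the top-k binarization rule, and all names (top_k_mask, layer_module, and so on) are assumptions chosen for illustration, and random scores stand in for the learned mask scores. The last lines compare the storage of one separate module per layer with the storage of one shared prototype plus a 1-bit mask per layer.

```python
import random

rng = random.Random(0)

# Hypothetical sizes, chosen only for illustration.
NUM_LAYERS = 12        # layers that each receive a PETL module
PROTOTYPE_SIZE = 1000  # number of weights in the flattened prototype network
KEEP_RATIO = 0.5       # fraction of prototype weights each layer keeps
FLOAT_BITS = 32        # bits used to store one real-valued weight


def top_k_mask(scores, keep_ratio):
    """Return a 0/1 mask that keeps exactly the highest-scoring entries.

    Assumption: binarizing by top-k is one possible rule; it stands in
    for whatever thresholding the actual training code applies.
    """
    k = max(1, int(len(scores) * keep_ratio))
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = set(ranked[:k])
    return [1 if i in keep else 0 for i in range(len(scores))]


# One prototype network shared by every layer.
prototype = [rng.gauss(0.0, 0.02) for _ in range(PROTOTYPE_SIZE)]

# One score vector per layer (random here, learned in practice),
# binarized into one mask per layer.
layer_scores = [[rng.random() for _ in range(PROTOTYPE_SIZE)]
                for _ in range(NUM_LAYERS)]
layer_masks = [top_k_mask(scores, KEEP_RATIO) for scores in layer_scores]


def layer_module(layer_idx):
    """Sub-network used at one layer: the prototype with masked-out weights zeroed."""
    return [w * m for w, m in zip(prototype, layer_masks[layer_idx])]


# Storage accounting: separate modules store every weight per layer,
# the shared variant stores the prototype once plus one bit per mask entry.
separate_bits = NUM_LAYERS * PROTOTYPE_SIZE * FLOAT_BITS
shared_bits = PROTOTYPE_SIZE * FLOAT_BITS + NUM_LAYERS * PROTOTYPE_SIZE
storage_ratio = shared_bits / separate_bits

print(f"shared / separate storage: {storage_ratio:.1%}")
```

The printed ratio follows only from the hypothetical sizes above and should not be read as a result from the paper. The sketch covers sharing across layers only; sharing across tasks would add one set of masks per task on top of the same prototype.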

How to reproduce our work

In our work, we have conducted experiments on both encoder-only models (RoBERTa) and encoder-decoder models (T5).

Our work is divided into two main sections:
- The first section focuses on the RoBERTa experiments using different variations, including ProAdapter, ProLoRA, and ProPrefix.
- The second section, ProPETL-T5, applies our methods to the T5 model.

Prerequisites

Ensure that you have a suitable environment to run our code. We primarily use PyTorch 1.11.0+cu113 on A100 GPUs. You can install the necessary requirements for each part of the codebase individually using the provided requirements.txt files.

Reproducing the Experiments

Encoder-Only (RoBERTa) Experiments

Navigate to the propelt-roberta directory, install the requirements, and then run the model using the provided scripts. These scripts are prepared to reproduce the experiments mentioned in Table 1 of our paper.

    cd propelt-roberta
    pip install -r requirements.txt
    pip install .
    bash scripts/run_adapter.sh  # for ProAdapter
    bash scripts/run_lora.sh     # for ProLoRA
    bash scripts/run_prefix.sh   # for ProPrefix

For more information, please refer to the README file in the propelt-roberta directory.

Encoder-Decoder (T5) Experiments

Navigate to the propelt-t5 directory, install the requirements, and then run the model using the provided scripts. For example, the command below reproduces the experiments reported in Table 2 and Figure 3 of our paper.

    cd propelt-t5
    pip install -r requirements.txt
    # replace with the desired config and random seed
    CUDA_VISIBLE_DEVICES=0 python3 finetune_t5_trainer.py configs/glue/propetl_adapter_reduction12.json 42

For other configs, please refer to the README file in the propelt-t5 directory.
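If you want to repeat the T5 run over several random seeds, a small helper along the following lines may be convenient. This is only a sketch: the seed list and the build_command helper are hypothetical, and the script name, config path, and argument order are taken from the command shown above. It prints the command lines rather than launching them.

```python
import shlex

# Taken from the T5 command shown in this README.
SCRIPT = "finetune_t5_trainer.py"
CONFIG = "configs/glue/propetl_adapter_reduction12.json"

# Hypothetical seed list; 42 is the seed used in the README example.
SEEDS = [42, 43, 44]


def build_command(config, seed, gpu="0"):
    """Return (extra environment variables, argv) for one training run."""
    env = {"CUDA_VISIBLE_DEVICES": gpu}
    argv = ["python3", SCRIPT, config, str(seed)]
    return env, argv


commands = [build_command(CONFIG, seed) for seed in SEEDS]

for env, argv in commands:
    prefix = " ".join(f"{key}={value}" for key, value in env.items())
    # Printed for copy-pasting into a shell opened in the propelt-t5 directory.
    print(prefix, shlex.join(argv))
```

To launch the runs directly instead of printing them, one option is subprocess.run(argv, env={**os.environ, **env}, check=True), executed from inside the propelt-t5 directory.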

Reference

If you find this repository useful, please cite our paper:

    @inproceedings{zeng2023onenetwork,
      title={One Network, Many Masks: Towards More Parameter-Efficient Transfer Learning},
      author={Guangtao Zeng and Peiyuan Zhang and Wei Lu},
      booktitle={Proceedings of ACL},
      year={2023}
    }

Owner

Login: ChaosCodes
Kind: user

Repositories: 23
Profile: https://github.com/ChaosCodes

GitHub Events

Total

Watch event: 3
Fork event: 1

Last Year

Watch event: 3
Fork event: 1

Issues and Pull Requests

Last synced: 11 months ago

All Time

Total issues: 0
Total pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Total issue authors: 0
Total pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 0
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 0
Pull request authors: 0
Average comments per issue: 0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0
