https://github.com/amazon-science/mezo_svrg

Code for the ICML 2024 paper: "Variance-reduced Zeroth-Order Methods for Fine-Tuning Language Models"

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

○ CITATION.cff file
✓ codemeta.json file (found)
✓ .zenodo.json file (found)
○ DOI references
○ Academic publication links
○ Committers with academic emails
○ Institutional organization owner
○ JOSS paper metadata
○ Scientific vocabulary similarity: low similarity (12.5%) to scientific vocabulary

Keywords

deep-learning fine-tuning language-model large-language-models llm-training llms machine-learning machine-learning-algorithms optimization optimization-algorithms svrg variance-reduction zero-order-methods

Last synced: 9 months ago

Repository

Code for the ICML 2024 paper: "Variance-reduced Zeroth-Order Methods for Fine-Tuning Language Models"

Basic Info

Host: GitHub
Owner: amazon-science
License: apache-2.0
Language: Python
Default Branch: main
Homepage: https://arxiv.org/abs/2404.08080
Size: 96.7 KB

Statistics

Stars: 11
Watchers: 1
Forks: 0
Open Issues: 2
Releases: 0

Topics

Created almost 2 years ago · Last pushed almost 2 years ago

Metadata Files

Readme Contributing License Code of conduct

README.md

MeZO-SVRG: Variance-Reduced Zero-Order Methods for fine-tuning LLMs

This repository implements the Memory-Efficient Zeroth-Order Stochastic Variance-Reduced Gradient (MeZO-SVRG) algorithm for fine-tuning pre-trained Hugging Face language models. As baselines, it also implements the Memory-efficient ZO Optimizer (MeZO) and first-order SGD (FO-SGD). The repository is written in PyTorch and uses the PyTorch Lightning framework.
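As background on what "zeroth-order" means here, the following is a minimal sketch of a two-point (SPSA-style) gradient estimate of the kind MeZO is built on. It is not the repository's code: it uses plain Python lists instead of PyTorch tensors, and the names `perturb`, `spsa_projected_grad`, `zo_sgd_step`, and `toy_loss` are hypothetical. The random direction is regenerated from a seed each time it is needed rather than stored, which is the idea behind the memory savings.

```python
import random


def perturb(theta, seed, scale):
    """Return theta + scale * z, with z ~ N(0, I) regenerated from `seed`."""
    rng = random.Random(seed)
    return [t + scale * rng.gauss(0.0, 1.0) for t in theta]


def spsa_projected_grad(loss, theta, seed, eps=1e-3):
    """Central-difference estimate of the directional derivative along z."""
    loss_plus = loss(perturb(theta, seed, +eps))
    loss_minus = loss(perturb(theta, seed, -eps))
    return (loss_plus - loss_minus) / (2.0 * eps)


def zo_sgd_step(loss, theta, seed, lr=0.1, eps=1e-3):
    """One zeroth-order SGD step: theta <- theta - lr * g * z."""
    g = spsa_projected_grad(loss, theta, seed, eps)
    # Replaying the seed reproduces the same z, so z is never kept in memory.
    return perturb(theta, seed, -lr * g)


def toy_loss(theta):
    """Toy quadratic objective standing in for a fine-tuning loss."""
    return 0.5 * sum(t * t for t in theta)


theta = [1.0, -2.0, 0.5]
for step in range(200):
    theta = zo_sgd_step(toy_loss, theta, seed=step)
final_loss = toy_loss(theta)
```

Only two forward evaluations of the loss are needed per step and no backward pass, which is why this family of methods needs far less memory than first-order fine-tuning.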

Installation

To set up the Python environment, run the following commands:

    conda create --name zo_opt python=3.9
    conda activate zo_opt
    python -m pip install -r requirements.txt

File Overview

This repository implements the MeZO-SVRG algorithm and enables fine-tuning of a range of language models on the GLUE benchmark datasets. To run experiments, execute the 'finetune_llm.sh' bash script.

The script supports the following models:

1. 'distilbert-base-cased'
2. 'roberta-large'
3. 'gpt2-xl'
4. 'facebook/opt-2.7b'
5. 'facebook/opt-6.7b'

The script supports the following GLUE tasks:

1. MNLI
2. QNLI
3. SST-2
4. CoLA
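For readers who want to inspect these tasks outside the script, here is a small sketch that maps the task names above to the configuration names used by the GLUE dataset on the Hugging Face Hub. It assumes the `datasets` package is installed and that the dataset is published under the id `nyu-mll/glue`. The helper names `glue_config` and `load_glue_task` are hypothetical, and nothing here is taken from the repository's own data-loading code.

```python
# Task names as listed above, mapped to Hugging Face GLUE configuration names.
GLUE_CONFIGS = {
    "MNLI": "mnli",
    "QNLI": "qnli",
    "SST-2": "sst2",
    "CoLA": "cola",
}


def glue_config(task):
    """Translate a task name from the list above into a GLUE config name."""
    try:
        return GLUE_CONFIGS[task]
    except KeyError:
        supported = ", ".join(sorted(GLUE_CONFIGS))
        raise ValueError(f"Unsupported task {task!r}; expected one of: {supported}") from None


def load_glue_task(task, split="train"):
    """Load one split of a GLUE task (assumes the `datasets` package is installed)."""
    from datasets import load_dataset  # imported lazily so the mapping works without it

    return load_dataset("nyu-mll/glue", glue_config(task), split=split)
```

Calling `load_glue_task("SST-2")` would download the SST-2 training split on first use. Note that MNLI exposes its validation data as `validation_matched` and `validation_mismatched` rather than a single `validation` split.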

Select the fine-tuning algorithm by passing one of {'FO', 'ZO', 'ZOSVRG'}, which correspond to the FO-SGD, MeZO, and MeZO-SVRG implementations described above. The exact hyperparameter settings used to generate the tables and figures are provided in the Appendix of the paper.
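To give a rough idea of how a variance-reduced option differs from a plain zeroth-order one, the sketch below shows a generic SVRG-style control variate built from zeroth-order estimates: a minibatch estimate at the current parameters is corrected by the same minibatch estimate at an anchor point plus a full-batch estimate at that anchor. This is an illustration under simplifying assumptions (a toy linear model, plain Python lists, hypothetical names such as `zo_grad` and `svrg_direction`); the paper and the repository code remain the reference for the exact MeZO-SVRG procedure.

```python
import random


def gaussian_direction(dim, seed):
    """Sample a perturbation direction z ~ N(0, I) from a seed."""
    rng = random.Random(seed)
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def batch_loss(theta, batch):
    """Mean squared error of a linear model over (x, y) pairs."""
    total = 0.0
    for x, y in batch:
        prediction = sum(t * xi for t, xi in zip(theta, x))
        total += (prediction - y) ** 2
    return total / len(batch)


def zo_grad(theta, batch, z, eps=1e-3):
    """Two-point zeroth-order gradient estimate along direction z."""
    plus = [t + eps * zi for t, zi in zip(theta, z)]
    minus = [t - eps * zi for t, zi in zip(theta, z)]
    scale = (batch_loss(plus, batch) - batch_loss(minus, batch)) / (2.0 * eps)
    return [scale * zi for zi in z]


def svrg_direction(theta, anchor, anchor_full_grad, batch, z):
    """SVRG-style direction: g_B(theta) - g_B(anchor) + g_full(anchor)."""
    g_now = zo_grad(theta, batch, z)
    g_anchor = zo_grad(anchor, batch, z)
    return [a - b + c for a, b, c in zip(g_now, g_anchor, anchor_full_grad)]


# Toy data: three (features, target) pairs.
data = [([1.0, 0.0], 1.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 0.5)]
anchor = [0.2, -0.3]
z = gaussian_direction(2, seed=0)
anchor_full_grad = zo_grad(anchor, data, z)  # full-batch estimate at the anchor

theta = [0.25, -0.2]
direction = svrg_direction(theta, anchor, anchor_full_grad, data[:1], z)
theta = [t - 0.01 * d for t, d in zip(theta, direction)]
```

When the current parameters coincide with the anchor, the two minibatch terms cancel and the direction reduces to the full-batch estimate, which is the sense in which the anchor term controls the minibatch noise.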

Citation

Please consider citing our paper if you use our code:

    @misc{gautam2024variancereduced,
      title={Variance-reduced Zeroth-Order Methods for Fine-Tuning Language Models},
      author={Tanmay Gautam and Youngsuk Park and Hao Zhou and Parameswaran Raman and Wooseok Ha},
      year={2024},
      eprint={2404.08080},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
    }

Security

See CONTRIBUTING for more information.

License

This project is licensed under the Apache-2.0 License.

Owner

Name: Amazon Science
Login: amazon-science
Kind: organization

Website: https://amazon.science
Twitter: AmazonScience
Repositories: 80
Profile: https://github.com/amazon-science

GitHub Events

Total

Watch event: 5

Last Year

Watch event: 5

Committers

Last synced: 10 months ago

All Time

Total Commits: 4
Total Committers: 3
Avg Commits per committer: 1.333
Development Distribution Score (DDS): 0.5

Past Year

Commits: 0
Committers: 0
Avg Commits per committer: 0.0
Development Distribution Score (DDS): 0.0

Top Committers

Name	Email	Commits
Parameswaran Raman	p**n@a**m	2
Amazon GitHub Automation	5****o	1
Params Raman	p**r@g**m	1

Committer Domains (Top 20 + Academic)

amazon.com: 1

Issues and Pull Requests

Last synced: 10 months ago

All Time

Total issues: 2
Total pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Total issue authors: 1
Total pull request authors: 0
Average comments per issue: 1.0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0

Past Year

Issues: 2
Pull requests: 0
Average time to close issues: N/A
Average time to close pull requests: N/A
Issue authors: 1
Pull request authors: 0
Average comments per issue: 1.0
Average comments per pull request: 0
Merged pull requests: 0
Bot issues: 0
Bot pull requests: 0
