https://github.com/amazon-science/mezo_svrg
Code for the ICML 2024 paper: "Variance-reduced Zeroth-Order Methods for Fine-Tuning Language Models"
Science Score: 26.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file (found)
- ✓ .zenodo.json file (found)
- ○ DOI references
- ○ Academic publication links
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity (low similarity, 12.5%)
Repository
Code for the ICML 2024 paper: "Variance-reduced Zeroth-Order Methods for Fine-Tuning Language Models"
Basic Info
- Host: GitHub
- Owner: amazon-science
- License: apache-2.0
- Language: Python
- Default Branch: main
- Homepage: https://arxiv.org/abs/2404.08080
- Size: 96.7 KB
Statistics
- Stars: 11
- Watchers: 1
- Forks: 0
- Open Issues: 2
- Releases: 0
Metadata Files
README.md
MeZO-SVRG: Variance-Reduced Zeroth-Order Methods for Fine-Tuning LLMs
This repository implements the Memory-Efficient Zeroth-Order Stochastic Variance-Reduced Gradient (MeZO-SVRG) algorithm for fine-tuning pre-trained Hugging Face language models. As baselines, it also implements the Memory-Efficient ZO optimizer (MeZO) and first-order SGD (FO-SGD). The repository is written in PyTorch and leverages the PyTorch Lightning framework.
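The common building block behind these ZO methods is a two-point gradient estimate along a random direction. Below is a minimal PyTorch sketch of that estimator, not the repository's implementation; `model`, `loss_fn`, and `batch` are generic placeholders for a Hugging Face LM, its loss function, and a minibatch.

```python
import torch

def zo_gradient_coefficient(model, loss_fn, batch, eps=1e-3, seed=0):
    """Two-point ZO estimate along a random direction z:
        g_hat = (L(theta + eps*z) - L(theta - eps*z)) / (2*eps) * z.
    Returns the scalar coefficient; z can be regenerated from `seed` when
    applying the update, so it never needs to be stored (the memory trick
    MeZO-style methods rely on)."""
    params = [p for p in model.parameters() if p.requires_grad]

    def perturb(scale):
        torch.manual_seed(seed)            # same seed => same z every call
        for p in params:
            z = torch.randn_like(p)
            p.data.add_(z, alpha=scale * eps)

    with torch.no_grad():
        perturb(+1.0)                      # theta -> theta + eps*z
        loss_plus = loss_fn(model, batch)
        perturb(-2.0)                      # -> theta - eps*z
        loss_minus = loss_fn(model, batch)
        perturb(+1.0)                      # restore theta
    return (loss_plus - loss_minus) / (2 * eps)
```

Because only two forward passes and a scalar are needed, the estimator avoids storing activations or gradients, which is what makes ZO fine-tuning memory-efficient.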
Installation
To set up the Python environment, use the following commands:
```bash
conda create --name zo_opt python=3.9
conda activate zo_opt
python -m pip install -r requirements.txt
```
File Overview
This repository implements the MeZO-SVRG algorithm and enables fine-tuning a range of language models on the GLUE benchmark. To run experiments, execute the `finetune_llm.sh` bash script.
The script supports the following models:
1. `distilbert-base-cased`
2. `roberta-large`
3. `gpt2-xl`
4. `facebook/opt-2.7b`
5. `facebook/opt-6.7b`
The script supports the following GLUE tasks:
1. MNLI
2. QNLI
3. SST-2
4. CoLA
Select the fine-tuning algorithm by passing one of 'FO', 'ZO', or 'ZOSVRG' (see the sketch of the variance-reduced update below). The exact hyperparameter settings used to generate the tables and figures in the paper are provided in the Appendix.
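For intuition on what the 'ZOSVRG' option computes: MeZO-SVRG follows the classic SVRG template, correcting a minibatch estimate at the current parameters with estimates taken at a periodically refreshed anchor point. The sketch below shows that recursion only, assuming a generic `estimate_grad(params, data)` that returns one gradient tensor per parameter (e.g., a ZO estimate); it is not the repository's exact update rule, and `q` and `lr` are illustrative values.

```python
import torch

def svrg_finetune(params, estimate_grad, full_batch, minibatches, lr=1e-4, q=10):
    """SVRG-style variance-reduced direction:
        g = g_hat(theta; B) - g_hat(anchor; B) + g_hat(anchor; full batch),
    where `anchor` is a snapshot of the parameters refreshed every q steps."""
    for step, minibatch in enumerate(minibatches):
        if step % q == 0:
            # Refresh the anchor point and its full-batch gradient estimate.
            anchor = [p.detach().clone() for p in params]
            g_full = estimate_grad(anchor, full_batch)
        g_theta = estimate_grad(params, minibatch)
        g_anchor = estimate_grad(anchor, minibatch)
        with torch.no_grad():
            for p, gt, ga, gf in zip(params, g_theta, g_anchor, g_full):
                p.add_(gt - ga + gf, alpha=-lr)
    return params
```

The anchor terms cancel in expectation, so the direction stays unbiased while the minibatch noise is reduced, which is the source of the method's improved stability over plain ZO-SGD.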
Citation
Please consider citing our paper if you use our code:
```text
@misc{gautam2024variancereduced,
      title={Variance-reduced Zeroth-Order Methods for Fine-Tuning Language Models},
      author={Tanmay Gautam and Youngsuk Park and Hao Zhou and Parameswaran Raman and Wooseok Ha},
      year={2024},
      eprint={2404.08080},
      archivePrefix={arXiv},
      primaryClass={cs.LG}
}
```
Security
See CONTRIBUTING for more information.
License
This project is licensed under the Apache-2.0 License.
Owner
- Name: Amazon Science
- Login: amazon-science
- Kind: organization
- Website: https://amazon.science
- Twitter: AmazonScience
- Repositories: 80
- Profile: https://github.com/amazon-science
GitHub Events
Total
- Watch event: 5
Last Year
- Watch event: 5
Committers
Last synced: 7 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Parameswaran Raman | p****n@a****m | 2 |
| Amazon GitHub Automation | 5****o | 1 |
| Params Raman | p****r@g****m | 1 |
Issues and Pull Requests
Last synced: 7 months ago
All Time
- Total issues: 2
- Total pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Total issue authors: 1
- Total pull request authors: 0
- Average comments per issue: 1.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 2
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 1
- Pull request authors: 0
- Average comments per issue: 1.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- stan-anony (2)