https://github.com/amazon-science/mezo_svrg

Code for the ICML 2024 paper: "Variance-reduced Zeroth-Order Methods for Fine-Tuning Language Models"

Science Score: 26.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (12.5%) to scientific vocabulary

Keywords

deep-learning fine-tuning language-model large-language-models llm-training llms machine-learning machine-learning-algorithms optimization optimization-algorithms svrg variance-reduction zero-order-methods
Last synced: 5 months ago

Repository

Code for the ICML 2024 paper: "Variance-reduced Zeroth-Order Methods for Fine-Tuning Language Models"

Basic Info
Statistics
  • Stars: 11
  • Watchers: 1
  • Forks: 0
  • Open Issues: 2
  • Releases: 0
Created over 1 year ago · Last pushed over 1 year ago
Metadata Files
Readme Contributing License Code of conduct

README.md

MeZO-SVRG: Variance-Reduced Zero-Order Methods for fine-tuning LLMs

This repository implements the Memory-Efficient Zeroth-Order Stochastic Variance-Reduced Gradient (MeZO-SVRG) algorithm for fine-tuning pre-trained Hugging Face language models (LMs). As baselines, it also implements the Memory-efficient ZO Optimizer (MeZO) and first-order SGD (FO-SGD). The repository is written in PyTorch and uses the PyTorch Lightning framework.
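To make the idea of a variance-reduced zeroth-order update concrete, here is a minimal sketch in plain Python on a toy linear regression problem. It is not the repository's implementation: the function names (`spsa_grad`, `zo_svrg`, `mse`), the single learning rate, the refresh interval `q`, and the toy data are all assumptions chosen for illustration. The sketch combines a two-point (SPSA-style) gradient estimate, which needs only loss evaluations, with an SVRG-style correction against a periodically refreshed snapshot.

```python
import random

def spsa_grad(loss, theta, batch, seed, eps=1e-3):
    """Two-point zeroth-order gradient estimate built from loss evaluations only."""
    rng = random.Random(seed)  # seeded so the same perturbation z can be reproduced
    z = [rng.gauss(0.0, 1.0) for _ in theta]
    plus = [t + eps * zi for t, zi in zip(theta, z)]
    minus = [t - eps * zi for t, zi in zip(theta, z)]
    scale = (loss(plus, batch) - loss(minus, batch)) / (2.0 * eps)
    return [scale * zi for zi in z]

def mse(theta, batch):
    """Toy loss (assumption): mean squared error of a linear model on (x, y) pairs."""
    return sum((sum(t * xi for t, xi in zip(theta, x)) - y) ** 2
               for x, y in batch) / len(batch)

def zo_svrg(loss, theta, data, steps=2000, q=10, lr=0.02, batch_size=4, seed=0):
    """SVRG-style loop that uses zeroth-order estimates in place of true gradients."""
    rng = random.Random(seed)
    snapshot, full_grad = list(theta), [0.0] * len(theta)
    for t in range(steps):
        s = rng.randrange(10**9)  # perturbation seed for this step
        if t % q == 0:
            # Every q steps: refresh the snapshot and its full-data estimate.
            snapshot = list(theta)
            full_grad = spsa_grad(loss, snapshot, data, s)
            update = full_grad
        else:
            # Otherwise: minibatch estimate at theta, corrected by the estimate
            # at the snapshot (same minibatch, same perturbation) plus full_grad.
            batch = rng.sample(data, batch_size)
            g_cur = spsa_grad(loss, theta, batch, s)
            g_snap = spsa_grad(loss, snapshot, batch, s)
            update = [a - b + c for a, b, c in zip(g_cur, g_snap, full_grad)]
        theta = [p - lr * u for p, u in zip(theta, update)]
    return theta

# Toy data (assumption): noiseless targets y = 2*x0 - 1*x1.
data_rng = random.Random(1)
xs = [[data_rng.uniform(-1, 1), data_rng.uniform(-1, 1)] for _ in range(32)]
data = [(x, 2.0 * x[0] - 1.0 * x[1]) for x in xs]

theta = zo_svrg(mse, [0.0, 0.0], data)
print(theta, mse(theta, data))
```

Because every estimate is built from forward evaluations of the loss, no backpropagation is required; this is the property that makes zeroth-order methods attractive when memory is the constraint. For the algorithm as actually published, including its hyperparameters, refer to the paper and the repository code.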

Installation

To set up the Python environment, run the following commands:

    conda create --name zo_opt python=3.9
    conda activate zo_opt
    python -m pip install -r requirements.txt

File Overview

This repository implements the MeZO-SVRG algorithm and enables fine-tuning of a range of language models on the GLUE benchmark datasets. To run experiments, execute the 'finetune_llm.sh' bash script.

The script supports the following models:
1. 'distilbert-base-cased'
2. 'roberta-large'
3. 'gpt2-xl'
4. 'facebook/opt-2.7b'
5. 'facebook/opt-6.7b'

The script supports the following GLUE tasks:
1. MNLI
2. QNLI
3. SST-2
4. CoLA

Select the fine-tuning algorithm by passing one of the following: 'FO', 'ZO', or 'ZOSVRG'. The exact hyperparameter settings used to generate the tables and figures in the paper are provided in the paper's Appendix.
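The supported combinations listed above can be captured in a small pre-flight check. The sketch below is a hypothetical helper (`validate_run` is not part of the repository, and it does not reflect how 'finetune_llm.sh' parses its arguments). The mapping from algorithm flags to method names is an assumption inferred from the naming in this README.

```python
# Values copied from the lists in this README.
SUPPORTED_MODELS = {
    "distilbert-base-cased",
    "roberta-large",
    "gpt2-xl",
    "facebook/opt-2.7b",
    "facebook/opt-6.7b",
}
SUPPORTED_TASKS = {"MNLI", "QNLI", "SST-2", "CoLA"}

# Assumed mapping of flags to methods (inferred from the README, not confirmed).
ALGORITHMS = {
    "FO": "first-order SGD (FO-SGD)",
    "ZO": "MeZO",
    "ZOSVRG": "MeZO-SVRG",
}

def validate_run(model: str, task: str, algorithm: str) -> str:
    """Hypothetical helper: reject unsupported settings before launching a run."""
    if model not in SUPPORTED_MODELS:
        raise ValueError(f"unsupported model: {model!r}")
    if task not in SUPPORTED_TASKS:
        raise ValueError(f"unsupported GLUE task: {task!r}")
    if algorithm not in ALGORITHMS:
        raise ValueError(f"unsupported algorithm: {algorithm!r}")
    return ALGORITHMS[algorithm]

print(validate_run("roberta-large", "SST-2", "ZOSVRG"))
```

Failing fast on a mistyped model or task name avoids discovering the error only after a large checkpoint has been downloaded.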

Citation

Please consider citing our paper if you use our code:

@misc{gautam2024variancereduced,
  title={Variance-reduced Zeroth-Order Methods for Fine-Tuning Language Models},
  author={Tanmay Gautam and Youngsuk Park and Hao Zhou and Parameswaran Raman and Wooseok Ha},
  year={2024},
  eprint={2404.08080},
  archivePrefix={arXiv},
  primaryClass={cs.LG}
}

Security

See CONTRIBUTING for more information.

License

This project is licensed under the Apache-2.0 License.

Owner

  • Name: Amazon Science
  • Login: amazon-science
  • Kind: organization

GitHub Events

Total
  • Watch event: 5
Last Year
  • Watch event: 5

Committers

Last synced: 7 months ago

All Time
  • Total Commits: 4
  • Total Committers: 3
  • Avg Commits per committer: 1.333
  • Development Distribution Score (DDS): 0.5
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
Name Email Commits
Parameswaran Raman p****n@a****m 2
Amazon GitHub Automation 5****o 1
Params Raman p****r@g****m 1
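
The "All Time" committer figures can be reproduced from the Top Committers table with a few lines of arithmetic. The page does not state how the Development Distribution Score is computed; the formula used here (one minus the top committer's share of all commits) is an assumption that happens to match the reported value.

```python
# Commit counts copied from the Top Committers table.
commits = {
    "Parameswaran Raman": 2,
    "Amazon GitHub Automation": 1,
    "Params Raman": 1,
}

total_commits = sum(commits.values())
avg_per_committer = total_commits / len(commits)

# Assumed definition: DDS = 1 - (commits by top committer / total commits).
dds = 1 - max(commits.values()) / total_commits

print(total_commits, round(avg_per_committer, 3), dds)
```

Under this definition, a score of 0 means a single committer authored every commit, and values closer to 1 mean the work is spread more evenly.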

Issues and Pull Requests

Last synced: 7 months ago

All Time
  • Total issues: 2
  • Total pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Total issue authors: 1
  • Total pull request authors: 0
  • Average comments per issue: 1.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 2
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 1
  • Pull request authors: 0
  • Average comments per issue: 1.0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • stan-anony (2)