https://github.com/aiot-mlsys-lab/svd-llm

Official Code for "SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression"

Science Score: 23.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Academic email domains
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (11.7%) to scientific vocabulary

Keywords

efficient-models generative-ai large-language-models model-compression
Last synced: 5 months ago

Repository

Official Code for "SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression"

Basic Info
Statistics
  • Stars: 135
  • Watchers: 6
  • Forks: 10
  • Open Issues: 8
  • Releases: 0
Topics
efficient-models generative-ai large-language-models model-compression
Created almost 2 years ago · Last pushed 11 months ago
Metadata Files
Readme License

README.md


SVD-LLM: Singular Value Decomposition for Large Language Model Compression

Introduction

SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression

Xin Wang¹, Yu Zheng², Zhongwei Wan¹, Mi Zhang¹
¹The Ohio State University, ²Michigan State University

International Conference on Learning Representations (ICLR) 2025

SVD-LLM V2: Optimizing Singular Value Truncation for Large Language Model Compression

Xin Wang, Samiul Alam, Zhongwei Wan, Hui Shen, Mi Zhang
The Ohio State University

Annual Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics (NAACL) 2025

Quick Start

Installation

Please keep the transformers package at exactly version 4.35.2, since the SVD-compressed version of the LLM slightly changes the model structure (in the component/. folder).

Create and activate a conda environment with Python 3.9 (newer versions break some dependencies):

```shell
conda create -n compress python=3.9
conda activate compress
```

Clone and navigate to the repository:

```shell
git clone https://github.com/AIoT-MLSys-Lab/SVD-LLM.git
cd SVD-LLM
```

Install the requirements:

```shell
pip install -r requirements.txt
```

Quick Example

```shell
bash compress_llama.sh
```

This script compresses the LLaMA-7B model at a 20% compression ratio and automatically runs the evaluation code, measuring both the perplexity and the efficiency of the compressed model.

Step-by-Step Instructions of SVD-LLM

1. Truncation-Aware Data Whitening + SVD Compression

At low compression ratios (recommended ratio <= 0.3), we first run data whitening on the LLM and save the weights along with the whitening information:

```shell
python SVDLLM.py \
    --step 1 \
    --ratio COMPRESSION_RATIO \
    --model HUGGINGFACE_MODEL_REPO \
    --whitening_nsamples WHITENING_SAMPLE_NUMBER \
    --dataset WHITENING_DATASET \
    --seed SAMPLING_SEED \
    --model_seq_len MODEL_SEQ_LEN \
    --save_path WHITENING_INFO_SAVING_PATH
```
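The core idea behind truncation-aware data whitening can be sketched in a few lines of NumPy (a hypothetical simplification for illustration, not the repo's actual SVDLLM.py): whiten the weight matrix by the Cholesky factor of the calibration activations' Gram matrix, so that truncating the SVD minimizes the layer's output error rather than the raw weight error.

```python
import numpy as np

def whitened_svd_compress(W, X, ratio=0.2, eps=1e-6):
    """Sketch of truncation-aware SVD compression (hypothetical helper,
    not the repo's implementation).

    W: (d_out, d_in) weight matrix.
    X: (d_in, n) calibration activations.
    ratio: fraction of parameters to remove.
    """
    # Whitening matrix S: Cholesky factor of the activation Gram matrix,
    # so that ||(W - W_approx) @ X||_F == ||(W - W_approx) @ S||_F and
    # SVD truncation of W @ S is optimal for the *output* error.
    S = np.linalg.cholesky(X @ X.T + eps * np.eye(X.shape[0]))
    U, sigma, Vt = np.linalg.svd(W @ S, full_matrices=False)
    # Rank k chosen so the two factors hold (1 - ratio) of W's parameters.
    k = int(W.shape[0] * W.shape[1] * (1 - ratio) / (W.shape[0] + W.shape[1]))
    A = U[:, :k] * np.sqrt(sigma[:k])                              # (d_out, k)
    B = (np.sqrt(sigma[:k])[:, None] * Vt[:k]) @ np.linalg.inv(S)  # (k, d_in)
    return A, B  # the layer's forward pass becomes x -> A @ (B @ x)
```

Because the truncation happens in the whitened basis, the rank-k factors minimize the output error on the calibration data, which is the property the "truncation-aware" name refers to.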

2. Parameter Update with Sequential Low-rank Approximation

We first update the compressed weight matrix U and then V with LoRA fine-tuning:

```shell
python LoRA.py \
    --prune_model COMPRESSED_MODEL_PATH \
    --data_path yahma/alpaca-cleaned \
    --output_dir LORA_OUTPUT_PATH \
    --lora_r 8 \
    --num_epochs 2 \
    --learning_rate 1e-4 \
    --batch_size 64
```
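The --lora_r 8 flag above sets the rank of the adapter. The mechanics of a LoRA update can be sketched as follows (a minimal NumPy illustration of the general technique, not the repo's LoRA.py): the frozen base weight is augmented with a trainable low-rank correction that starts at zero, so fine-tuning begins from the compressed model's exact behavior.

```python
import numpy as np

rng = np.random.default_rng(0)

class LoRALinear:
    """Minimal LoRA-style adapter sketch (hypothetical, for illustration).

    The frozen base weight W is augmented with a trainable low-rank update
    (alpha / r) * B @ A, analogous to --lora_r 8 in the command above.
    """

    def __init__(self, W, r=8, alpha=16):
        self.W = W                                            # frozen base
        self.A = rng.standard_normal((r, W.shape[1])) * 0.01  # trainable
        self.B = np.zeros((W.shape[0], r))                    # zero-init
        self.scale = alpha / r

    def forward(self, x):
        # Base path plus low-rank correction; since B starts at zero, the
        # adapter initially leaves the layer's output unchanged.
        return self.W @ x + self.scale * (self.B @ (self.A @ x))
```

Only A and B (2 * r * d parameters per layer instead of d * d) receive gradients during fine-tuning, which is what makes the parameter update cheap.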

3. SVD-LLM + GPTQ

SVD-LLM can also be integrated with quantization methods to achieve better compression. Here is an example of how to integrate SVD-LLM (20% compression ratio) with GPTQ-4bit to compress LLaMA-7B:

```shell
bash svdllm_gptq.sh
```
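Quantization is complementary to the low-rank step: SVD-LLM shrinks the number of weights, while quantization shrinks the bits per weight. A much-simplified round-to-nearest sketch conveys the idea (hypothetical; GPTQ itself additionally corrects quantization error using second-order activation statistics, and the script above invokes real GPTQ):

```python
import numpy as np

def quantize_rtn(W, bits=4, group_size=64):
    """Per-group round-to-nearest weight quantization (a hypothetical,
    simplified stand-in for GPTQ, for illustration only). Requires
    W.size to be divisible by group_size."""
    qmax = 2 ** (bits - 1) - 1                       # 7 for 4-bit signed
    groups = W.reshape(-1, group_size)
    out = np.empty_like(groups)
    for i, g in enumerate(groups):
        scale = max(np.abs(g).max() / qmax, 1e-12)   # per-group scale
        out[i] = np.round(g / scale) * scale         # quantize, dequantize
    return out.reshape(W.shape)
```

Applying 4-bit quantization to the SVD factors roughly multiplies the two savings, which is why the combined pipeline compresses further than either method alone.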

4. Evaluation

  • Perplexity Evaluation:

```shell
python SVDLLM.py \
    --step 4 \
    --model_path COMPRESSED_MODEL_SAVING_PATH
```

We use the same c4 dataset as in SparseGPT. Since the original download link is invalid, please download it directly from this link and add the two JSON files under the utils/. folder.

  • Efficiency Evaluation:

```shell
python SVDLLM.py \
    --step 5 \
    --model_path COMPRESSED_MODEL_SAVING_PATH
```

Citation

If you find this work useful, please cite:

```
@inproceedings{wang2025svdllm,
  title={{SVD}-{LLM}: Truncation-aware Singular Value Decomposition for Large Language Model Compression},
  author={Xin Wang and Yu Zheng and Zhongwei Wan and Mi Zhang},
  booktitle={International Conference on Learning Representations (ICLR)},
  year={2025},
  url={https://openreview.net/forum?id=LNYIUouhdt}
}
```

Owner

  • Name: OSU AIoT-MLSys Lab
  • Login: AIoT-MLSys-Lab
  • Kind: organization
  • Location: United States of America

GitHub Events

Total
  • Issues event: 29
  • Watch event: 137
  • Issue comment event: 32
  • Push event: 16
  • Pull request event: 1
  • Fork event: 17
Last Year
  • Issues event: 29
  • Watch event: 137
  • Issue comment event: 32
  • Push event: 16
  • Pull request event: 1
  • Fork event: 17

Issues and Pull Requests

Last synced: 6 months ago

All Time
  • Total issues: 17
  • Total pull requests: 1
  • Average time to close issues: N/A
  • Average time to close pull requests: about 1 month
  • Total issue authors: 15
  • Total pull request authors: 1
  • Average comments per issue: 0.0
  • Average comments per pull request: 0.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 17
  • Pull requests: 1
  • Average time to close issues: N/A
  • Average time to close pull requests: about 1 month
  • Issue authors: 15
  • Pull request authors: 1
  • Average comments per issue: 0.0
  • Average comments per pull request: 0.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • hsb1995 (4)
  • xiaxin1998 (3)
  • aswanthkrishna (3)
  • JeffreyWong20 (2)
  • codeit1792 (2)
  • 33answer33 (2)
  • jzzzf (2)
  • abcbdf (2)
  • NielsRogge (1)
  • Ambitious4 (1)
  • Ppaddington (1)
  • pvti (1)
  • pvtien96 (1)
  • NamburiSrinath (1)
  • choprahetarth (1)
Pull Request Authors
  • pvti (1)
  • kar-m (1)
  • tuidan (1)

Dependencies

requirements.txt pypi
  • accelerate *
  • datasets ==2.16.1
  • evaluate *
  • matplotlib ==3.4.3
  • numpy ==1.26.3
  • sentencepiece ==0.1.99
  • torch >=2.0.1
  • tqdm ==4.65.0
  • transformers ==4.35.2