sorsa
SORSA: Singular Values and Orthonormal Regularized Singular Vectors Adaptation of Large Language Models
Science Score: 54.0%
This score indicates how likely this project is to be science-related based on various indicators:
- ✓ CITATION.cff file: Found CITATION.cff file
- ✓ codemeta.json file: Found codemeta.json file
- ✓ .zenodo.json file: Found .zenodo.json file
- ○ DOI references
- ✓ Academic publication links: Links to arxiv.org, scholar.google
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity: Low similarity (12.7%) to scientific vocabulary
Repository
SORSA: Singular Values and Orthonormal Regularized Singular Vectors Adaptation of Large Language Models
Basic Info
- Host: GitHub
- Owner: Gunale0926
- License: apache-2.0
- Language: Python
- Default Branch: main
- Homepage: https://arxiv.org/abs/2409.00055
- Size: 4.39 MB
Statistics
- Stars: 40
- Watchers: 1
- Forks: 4
- Open Issues: 0
- Releases: 1
Metadata Files
README.md
SORSA: Singular Values and Orthonormal Regularized Singular Vectors Adaptation of Large Language Models
Author: Yang Cao
This repository contains the experiment code for the paper SORSA: Singular Values and Orthonormal Regularized Singular Vectors Adaptation of Large Language Models.

The rapid advancement in large language models (LLMs) comes with a significant increase in their parameter size, presenting challenges for adaptation and fine-tuning. Parameter-efficient fine-tuning (PEFT) methods are widely used to adapt LLMs for downstream tasks efficiently. In this paper, we propose Singular Values and Orthonormal Regularized Singular Vectors Adaptation, or SORSA, a novel PEFT method. Each SORSA adapter consists of two main parts: trainable principal singular weights $W_p = U_p \text{diag}(S_p) V_p^\top$, and frozen residual weights $W_r = U_r \text{diag}(S_r) V_r^\top$. These parts are initialized by performing SVD on pre-trained weights. Moreover, we implement and analyze an orthonormal regularizer. SORSA adapters can be merged during inference, thus eliminating any inference latency.
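The split into principal and residual weights can be illustrated with a small pure-Python sketch. It is not the sorsa package's implementation: the helper names are hypothetical, the weight matrix is built from known orthonormal factors instead of being decomposed by an SVD routine, and the penalty shown (the squared Frobenius norm of $Q^\top Q - I$) is an assumed form of an orthonormal regularizer that may differ from the one analyzed in the paper.

```python
# Illustrative only: helper names are hypothetical and not part of the sorsa package.

def transpose(m):
    return [list(col) for col in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def compose(u_cols, s, v_cols):
    """Return U diag(S) V^T, with U and V given as lists of column vectors."""
    # Scaling column k of U by s[k] is the same as multiplying by diag(S).
    scaled_u = transpose([[s_k * x for x in col] for s_k, col in zip(s, u_cols)])
    # The rows of V^T are exactly the columns of V.
    return matmul(scaled_u, v_cols)

def orthonormal_penalty(cols):
    """Squared Frobenius norm of (Q^T Q - I), where Q has `cols` as columns."""
    total = 0.0
    for i, ci in enumerate(cols):
        for j, cj in enumerate(cols):
            dot = sum(x * y for x, y in zip(ci, cj))
            target = 1.0 if i == j else 0.0
            total += (dot - target) ** 2
    return total

# Known SVD factors of a 3x3 stand-in for a pre-trained weight,
# with singular values in descending order.
u = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
v = [[0.6, 0.8, 0.0], [-0.8, 0.6, 0.0], [0.0, 0.0, 1.0]]
s = [3.0, 2.0, 1.0]
r = 1  # adapter rank (assumed value for the example)

w = compose(u, s, v)                 # full weight
w_p = compose(u[:r], s[:r], v[:r])   # principal part: trainable
w_r = compose(u[r:], s[r:], v[r:])   # residual part: frozen
merged = add(w_p, w_r)               # merging recovers a single dense weight
```

At initialization the principal singular vectors are orthonormal, so the penalty starts at zero and only grows if training pulls them away from orthonormality. Because `merged` is an ordinary dense matrix, the merged model has the same shape and cost as the base model at inference time.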
Empirical Experiments

Reproduce the Experiments
First, install the sorsa package with pip:
```bash
pip install sorsa
```
Then create a .env file in the root directory of the project and add your Hugging Face access token:
```bash
hf=Your_Hugging_Face_Access_Token
```
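As a rough illustration of how a script could pick up the token from that file, here is a standard-library sketch. The helper `read_env_value` is hypothetical and not taken from the repository; python-dotenv appears in the dependency list, so the project's own scripts may well load the file that way instead.

```python
import os
import tempfile

def read_env_value(path, key):
    """Return the value stored under `key` in a simple KEY=VALUE file, or None."""
    with open(path, encoding="utf-8") as handle:
        for raw_line in handle:
            line = raw_line.strip()
            # Skip blank lines, comments, and anything that is not KEY=VALUE.
            if not line or line.startswith("#") or "=" not in line:
                continue
            name, _, value = line.partition("=")
            if name.strip() == key:
                return value.strip().strip('"').strip("'")
    return None

# Demonstration with a throwaway file and a placeholder token.
with tempfile.TemporaryDirectory() as tmp:
    env_path = os.path.join(tmp, ".env")
    with open(env_path, "w", encoding="utf-8") as handle:
        handle.write("hf=Your_Hugging_Face_Access_Token\n")
    token = read_env_value(env_path, "hf")
```

Keeping the token in .env rather than in a script keeps it out of version control, provided .env is listed in .gitignore.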
Llama 2 7B, Mistral v0.1 7B and Gemma 7B
First, install the packages via Anaconda:
```bash
conda env create -f environment.yml
```
Run ./scripts/train_sorsa.sh to train the model.
After training, run ./scripts/merge_sorsa.sh to merge the adapter into the base model.
Run the following command to evaluate on GSM-8K:
```bash
python3 run.py --name llama2_sorsa_r128 \
  --test \
  --test-dataset gsm-8k \
  --test-precision bf16
```
Run the following command to evaluate on MATH:
```bash
python3 run.py --name llama2_sorsa_r128 \
  --test \
  --test-dataset math \
  --test-precision bf16
```
Run the following command to evaluate on HumanEval:
```bash
python3 run.py --name llama2_sorsa_r128 \
  --test \
  --test-dataset humaneval \
  --test-precision bf16
```
RWKV6
If you are training, merging, or testing an RWKV6 model, add the --rwkv flag to run.py.
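The three evaluation commands differ only in the dataset name, and the RWKV6 case only adds one flag, so they can be generated from a small helper. The sketch below is a convenience wrapper, not part of the repository: `build_eval_command` and `DATASETS` are hypothetical names, and only the flags shown in this README are used.

```python
import shlex

# Dataset identifiers exactly as they appear in the commands above.
DATASETS = ["gsm-8k", "math", "humaneval"]

def build_eval_command(name, dataset, precision="bf16", rwkv=False):
    """Assemble the run.py evaluation command as an argument list."""
    if dataset not in DATASETS:
        raise ValueError(f"unknown dataset: {dataset}")
    command = [
        "python3", "run.py",
        "--name", name,
        "--test",
        "--test-dataset", dataset,
        "--test-precision", precision,
    ]
    if rwkv:
        # Per the README, RWKV6 models need the extra --rwkv flag.
        command.append("--rwkv")
    return command

commands = [build_eval_command("llama2_sorsa_r128", d) for d in DATASETS]
for command in commands:
    print(shlex.join(command))
```

Each list can be passed to `subprocess.run(command, check=True)` from the repository root to launch the evaluation. The README only shows `bf16` as a precision value, so any other value passed here is untested guesswork.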
Cite the work
You can cite the work with the following BibTeX entry:
```bibtex
@article{cao2024sorsa,
  title={SORSA: Singular Values and Orthonormal Regularized Singular Vectors Adaptation of Large Language Models},
  author={Cao, Yang},
  journal={arXiv preprint arXiv:2409.00055},
  year={2024}
}
```
Owner
- Name: Yang Cao
- Login: Gunale0926
- Kind: user
- Location: Pennsylvania, United States
- Company: Wyoming Seminary
- Website: https://blog.yang-cao.com
- Repositories: 1
- Profile: https://github.com/Gunale0926
A random high school student.
Citation (CITATION.cff)
cff-version: 1.2.0
message: "Please cite our work as below"
authors:
  - family-names: "Cao"
    given-names: "Yang"
    orcid: "https://orcid.org/0009-0006-3236-6770"
title: "SORSA: Singular Values and Orthonormal Regularized Singular Vectors Adaptation of Large Language Models"
date-released: 2024-08-21
url: "https://github.com/Gunale0926/SORSA"
preferred-citation:
  type: generic
  authors:
    - family-names: "Cao"
      given-names: "Yang"
      orcid: "https://orcid.org/0000-0000-0000-0000"
  doi: 10.48550/arXiv.2409.00055
  title: "SORSA: Singular Values and Orthonormal Regularized Singular Vectors Adaptation of Large Language Models"
  year: 2024
  month: 8
GitHub Events
Total
- Issues event: 1
- Watch event: 8
- Delete event: 4
- Issue comment event: 1
- Push event: 11
- Pull request event: 7
- Fork event: 2
- Create event: 4
Last Year
- Issues event: 1
- Watch event: 8
- Delete event: 4
- Issue comment event: 1
- Push event: 11
- Pull request event: 7
- Fork event: 2
- Create event: 4
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 1
- Total pull requests: 33
- Average time to close issues: 2 minutes
- Average time to close pull requests: 41 minutes
- Total issue authors: 1
- Total pull request authors: 1
- Average comments per issue: 1.0
- Average comments per pull request: 0.0
- Merged pull requests: 33
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 1
- Pull requests: 33
- Average time to close issues: 2 minutes
- Average time to close pull requests: 41 minutes
- Issue authors: 1
- Pull request authors: 1
- Average comments per issue: 1.0
- Average comments per pull request: 0.0
- Merged pull requests: 33
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- Gunale0926 (2)
Pull Request Authors
- Gunale0926 (61)
Packages
- Total packages: 1
- Total downloads:
  - pypi: 11 last month
- Total dependent packages: 0
- Total dependent repositories: 0
- Total versions: 2
- Total maintainers: 1
pypi.org: sorsa
"SORSA: Singular Values and Orthonormal Regularized Singular Vectors Adaptation of Large Language Models" implementation integrated with Hugging Face transformers
- Homepage: https://github.com/Gunale0926/SORSA/
- Documentation: https://sorsa.readthedocs.io/
- License: Apache Software License
- Latest release: 1.0.1 (published over 1 year ago)
Maintainers (1)
Dependencies
- _libgcc_mutex 0.1
- _openmp_mutex 4.5
- absl-py 2.1.0
- aiohttp 3.9.5
- aiosignal 1.3.1
- aom 3.9.1
- appdirs 1.4.4
- arrow 1.2.3
- attrs 23.2.0
- aws-c-auth 0.7.22
- aws-c-cal 0.6.14
- aws-c-common 0.9.19
- aws-c-compression 0.2.18
- aws-c-event-stream 0.4.2
- aws-c-http 0.8.1
- aws-c-io 0.14.8
- aws-c-mqtt 0.10.4
- aws-c-s3 0.5.9
- aws-c-sdkutils 0.1.16
- aws-checksums 0.1.18
- aws-crt-cpp 0.26.9
- aws-sdk-cpp 1.11.329
- azure-core-cpp 1.12.0
- azure-identity-cpp 1.8.0
- azure-storage-blobs-cpp 12.11.0
- azure-storage-common-cpp 12.6.0
- azure-storage-files-datalake-cpp 12.10.0
- binaryornot 0.4.4
- blas 1.0
- blinker 1.6.2
- brotli-python 1.1.0
- bzip2 1.0.8
- c-ares 1.28.1
- ca-certificates 2024.7.2
- cairo 1.16.0
- certifi 2024.7.4
- cffi 1.16.0
- chardet 4.0.0
- charset-normalizer 3.3.2
- click 8.1.7
- colorama 0.4.6
- cookiecutter 2.6.0
- cryptography 42.0.5
- cuda-cudart 12.4.127
- cuda-cupti 12.4.127
- cuda-libraries 12.4.0
- cuda-nvrtc 12.4.127
- cuda-nvtx 12.4.127
- cuda-opencl 12.5.39
- cuda-runtime 12.4.0
- cuda-version 12.5
- datasets 2.20.0
- dav1d 1.2.1
- dill 0.3.8
- docker-pycreds 0.4.0
- evaluate 0.4.1
- expat 2.6.2
- ffmpeg 4.2.2
- filelock 3.15.4
- font-ttf-dejavu-sans-mono 2.37
- font-ttf-inconsolata 2.001
- font-ttf-source-code-pro 2.030
- font-ttf-ubuntu 0.83
- fontconfig 2.14.2
- fonts-anaconda 1
- fonts-conda-ecosystem 1
- freetype 2.12.1
- fribidi 1.0.10
- frozenlist 1.4.1
- fsspec 2024.5.0
- gflags 2.2.2
- gitdb 4.0.7
- gitpython 3.1.37
- glib 2.78.4
- glib-tools 2.78.4
- glog 0.7.1
- gmp 6.3.0
- gnutls 3.6.15
- google-auth-oauthlib 0.4.1
- graphite2 1.3.14
- grpcio 1.62.2
- h2 4.1.0
- harfbuzz 4.3.0
- hpack 4.0.0
- huggingface_hub 0.23.4
- hyperframe 6.0.1
- icu 73.2
- idna 3.7
- importlib-metadata 7.0.1
- intel-openmp 2023.1.0
- jinja2 3.1.4
- joblib 1.4.2
- keyutils 1.6.1
- krb5 1.21.3
- lame 3.100
- lcms2 2.16
- ld_impl_linux-64 2.40
- lerc 4.0.0
- libabseil 20240116.2
- libarrow 16.1.0
- libarrow-acero 16.1.0
- libarrow-dataset 16.1.0
- libarrow-substrait 16.1.0
- libass 0.14.0
- libblas 3.9.0
- libbrotlicommon 1.1.0
- libbrotlidec 1.1.0
- libbrotlienc 1.1.0
- libcblas 3.9.0
- libcrc32c 1.1.2
- libcublas 12.4.2.65
- libcufft 11.2.0.44
- libcufile 1.10.0.4
- libcurand 10.3.6.39
- libcurl 8.8.0
- libcusolver 11.6.0.99
- libcusparse 12.3.0.142
- libdeflate 1.20
- libdrm 2.4.122
- libedit 3.1.20191231
- libev 4.33
- libevent 2.1.12
- libexpat 2.6.2
- libffi 3.4.2
- libgcc-ng 14.1.0
- libgfortran-ng 14.1.0
- libgfortran5 14.1.0
- libglib 2.78.4
- libgomp 14.1.0
- libgoogle-cloud 2.24.0
- libgoogle-cloud-storage 2.24.0
- libgrpc 1.62.2
- libhwloc 2.10.0
- libiconv 1.17
- libidn2 2.3.4
- libjpeg-turbo 3.0.0
- liblapack 3.9.0
- libnghttp2 1.58.0
- libnpp 12.2.5.2
- libnsl 2.0.1
- libnvfatbin 12.5.39
- libnvjitlink 12.4.99
- libnvjpeg 12.3.1.89
- libopenblas 0.3.27
- libopenvino 2024.2.0
- libopenvino-auto-batch-plugin 2024.2.0
- libopenvino-auto-plugin 2024.2.0
- libopenvino-hetero-plugin 2024.2.0
- libopenvino-intel-cpu-plugin 2024.2.0
- libopenvino-intel-gpu-plugin 2024.2.0
- libopenvino-intel-npu-plugin 2024.2.0
- libopenvino-ir-frontend 2024.2.0
- libopenvino-onnx-frontend 2024.2.0
- libopenvino-paddle-frontend 2024.2.0
- libopenvino-pytorch-frontend 2024.2.0
- libopenvino-tensorflow-frontend 2024.2.0
- libopenvino-tensorflow-lite-frontend 2024.2.0
- libopus 1.3.1
- libparquet 16.1.0
- libpciaccess 0.18
- libpng 1.6.43
- libprotobuf 4.25.3
- libre2-11 2023.09.01
- libsqlite 3.46.0
- libssh2 1.11.0
- libstdcxx-ng 14.1.0
- libtasn1 4.19.0
- libthrift 0.19.0
- libtiff 4.6.0
- libunistring 0.9.10
- libutf8proc 2.8.0
- libuuid 2.38.1
- libva 2.21.0
- libvpx 1.7.0
- libwebp-base 1.4.0
- libxcb 1.15
- libxcrypt 4.4.36
- libxml2 2.12.7
- libzlib 1.2.13
- llvm-openmp 15.0.7
- lz4-c 1.9.4
- markdown 3.4.1
- markdown-it-py 2.2.0
- markupsafe 2.1.3
- mdurl 0.1.0
- mkl 2023.1.0
- mkl-service 2.4.0
- mkl_fft 1.3.8
- mkl_random 1.2.4
- mpmath 1.3.0
- multidict 6.0.5
- multiprocess 0.70.16
- ncurses 6.5
- nettle 3.7.3
- networkx 3.2.1
- numpy 1.26.4
- numpy-base 1.26.4
- oauthlib 3.2.2
- ocl-icd 2.3.2
- openh264 2.1.1
- openjpeg 2.5.2
- openssl 3.3.1
- orc 2.0.1
- p11-kit 0.24.1
- packaging 24.1
- pandas 2.2.2
- pathtools 0.1.2
- pcre2 10.42
- pillow 10.3.0
- pip 24.0
- pixman 0.43.2
- protobuf 4.25.3
- psutil 5.9.0
- pthread-stubs 0.3
- pugixml 1.14
- pyarrow 16.1.0
- pyarrow-core 16.1.0
- pyarrow-hotfix 0.6
- pyasn1 0.4.8
- pyasn1-modules 0.2.8
- pybind11-abi 5
- pycparser 2.22
- pygments 2.15.1
- pyjwt 2.8.0
- pyopenssl 24.0.0
- pysocks 1.7.1
- python 3.12.3
- python-dateutil 2.9.0
- python-dotenv 1.0.1
- python-slugify 5.0.2
- python-tzdata 2024.1
- python-xxhash 3.4.1
- python_abi 3.12
- pytorch 2.5.0.dev20240710
- pytorch-cuda 12.4
- pytorch-mutex 1.0
- pytz 2024.1
- pyyaml 6.0.1
- re2 2023.09.01
- readline 8.2
- regex 2024.5.15
- requests 2.32.3
- requests-oauthlib 1.3.0
- responses 0.13.3
- rich 13.3.5
- rsa 4.7.2
- s2n 1.4.15
- safetensors 0.4.3
- scikit-learn 1.4.2
- scipy 1.13.1
- sentry-sdk 1.9.0
- setproctitle 1.2.2
- setuptools 70.1.1
- six 1.16.0
- smmap 4.0.0
- snappy 1.2.0
- svt-av1 2.1.0
- sympy 1.12
- tbb 2021.12.0
- tensorboard 2.6.0
- tensorboard-plugin-wit 1.6.0
- text-unidecode 1.3
- threadpoolctl 3.5.0
- tk 8.6.13
- tokenizers 0.19.1
- torchaudio 2.4.0.dev20240710
- torchtriton 3.0.0+dedb7bdf33
- torchvision 0.20.0.dev20240710
- tqdm 4.66.4
- transformers 4.41.2
- typing-extensions 4.12.2
- typing_extensions 4.12.2
- tzdata 2024a
- unidecode 1.2.0
- urllib3 2.2.2
- wayland 1.23.0
- wayland-protocols 1.36
- werkzeug 3.0.3
- wheel 0.43.0
- x264 1!157.20191217
- x265 3.5
- xorg-fixesproto 5.0
- xorg-kbproto 1.0.7
- xorg-libice 1.1.1
- xorg-libsm 1.2.4
- xorg-libx11 1.8.9
- xorg-libxau 1.0.11
- xorg-libxdmcp 1.1.3
- xorg-libxext 1.3.4
- xorg-libxfixes 5.0.3
- xorg-libxrender 0.9.11
- xorg-renderproto 0.11.1
- xorg-xextproto 7.3.0
- xorg-xproto 7.0.31
- xxhash 0.8.2
- xz 5.2.6
- yaml 0.2.5
- yarl 1.9.4
- zipp 3.17.0
- zlib 1.2.13
- zstandard 0.22.0
- zstd 1.5.6