lotr

Low Tensor Rank adaptation of large language models

https://github.com/daskol/lotr

Science Score: 54.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
    Found CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
    Found .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (13.5%) to scientific vocabulary

Keywords

fine-tuning llm lora lotr parameter-efficient-tuning peft
Last synced: 6 months ago

Repository

Low Tensor Rank adaptation of large language models

Basic Info
Statistics
  • Stars: 9
  • Watchers: 1
  • Forks: 1
  • Open Issues: 1
  • Releases: 1
Topics
fine-tuning llm lora lotr parameter-efficient-tuning peft
Created about 2 years ago · Last pushed over 1 year ago
Metadata Files
Readme License Citation

README.md

LoTR: Low Tensor Rank Adaptation of Large Language Models

Low Tensor Rank adaptation of large language models

Overview

This repository is the original implementation of LoTR (arXiv:2402.01376), a novel approach to parameter-efficient fine-tuning of LLMs which represents the gradient update to parameters in the form of a tensor decomposition. The low-rank adapter for each layer is constructed as a product of three matrices, and the tensor structure arises from sharing the left and right multipliers of this product among layers. Simultaneous compression of a sequence of layers with a low-rank tensor representation allows LoTR to achieve even better parameter efficiency than LoRA, especially for deep models. Moreover, the core tensor does not depend on the original weight dimension and can be made arbitrarily small, which allows for extremely cheap and fast downstream fine-tuning.
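The parameterization described above can be sketched in plain NumPy. This is an illustration of the idea, not the repository's actual implementation; the hidden size, rank, and layer count are assumed values:

```python
import numpy as np

d, r, L = 768, 8, 12  # hidden size, tensor rank, number of layers (assumed)

rng = np.random.default_rng(0)

# The left and right multipliers are shared across all L layers...
A = rng.standard_normal((d, r))  # shared left factor
B = rng.standard_normal((r, d))  # shared right factor
# ...while each layer keeps only its own small r x r core.
cores = [rng.standard_normal((r, r)) for _ in range(L)]

# Per-layer weight update: delta W_l = A @ C_l @ B
deltas = [A @ C @ B for C in cores]

# LoTR amortizes A and B over layers; LoRA stores a fresh pair of
# d x r factors for every layer.
lotr_params = A.size + B.size + L * r * r
lora_params = L * 2 * d * r
print(lotr_params, lora_params)  # → 13056 147456
```

The gap widens with depth: the shared factors are paid for once, so each extra layer costs only r² parameters instead of 2dr.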

```bibtex
@misc{bershatsky2024lotr,
    title = {{LoTR}: Low Tensor Rank Weight Adaptation},
    author = {Daniel Bershatsky and Daria Cherniuk and Talgat Daulbaev and Aleksandr Mikhalev and Ivan Oseledets},
    year = {2024},
    eprint = {2402.01376},
    archivePrefix = {arXiv},
    primaryClass = {cs.CL}
}
```

Experiments

Logging Files

We assume that all raw experiment results (first of all, logging files) are located in the log directory. This directory's high-level structure should reflect the experimental setup, so a path relative to this directory should be structured as follows.

<dataset>/<model>/<method>/<param1>/<param2>/.../<seed>/<tfevents-file>

The model segment precedes the method segment since the number of different models is usually smaller than the number of methods, and the training pipeline is usually parameterized first by model and then by method. All floating-point parameters should be written in scientific notation to ensure that no significant digits are lost. The last directory is the random seed used to run an experiment.

Note that the requirements above are conventions rather than something enforced by tooling, since no full-featured machine learning experiment management software is in use.
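The layout above can be mapped back onto experiment metadata with a few lines of standard-library Python. This is a sketch, not code from the repository; the helper name and the example path are hypothetical, and the field names follow the path convention:

```python
from pathlib import PurePosixPath

def parse_run_path(path: str, names: list[str]) -> dict[str, str]:
    """Map the path segments of a tfevents file onto named fields.

    The last segment is the tfevents file itself; the segments before
    it follow <dataset>/<model>/<method>/<params...>/<seed>.
    """
    segments = PurePosixPath(path).parts[:-1]  # drop the tfevents filename
    if len(segments) != len(names):
        raise ValueError(f'expected {len(names)} segments, got {len(segments)}')
    return dict(zip(names, segments))

meta = parse_run_path(
    'glue/roberta-base/lotr/1e-4/8/42/events.out.tfevents.12345',
    ['dataset', 'model', 'method', 'lr', 'rank', 'seed'])
print(meta['method'], meta['seed'])  # → lotr 42
```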

Conversion to Arrow Parquet

TensorBoard tfevents files are quite large and take a noticeably long time to read and load, so we convert them to Parquet files with the following command.

```shell
python -m lotr.tb2parquet log/glue data/glue.parquet \
    --names model method task lr rank seed
```

Now, one can read a single parquet-file with all time series as follows.

```python
import pandas as pd

df = pd.read_parquet('data/glue.parquet')
```

To be more specific, about 20 MB of tfevents files convert to roughly 200 KB of Parquet.
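Once loaded, the named path segments become ordinary columns, so runs can be sliced and aggregated with regular pandas operations. A small sketch with synthetic data standing in for the real Parquet file (the column names follow the --names flag above; the metric values are made up):

```python
import pandas as pd

# Synthetic stand-in for data/glue.parquet: one row per logged scalar.
df = pd.DataFrame({
    'model': ['roberta-base'] * 4,
    'method': ['lotr', 'lotr', 'lora', 'lora'],
    'task': ['cola'] * 4,
    'lr': ['1e-4'] * 4,
    'rank': ['8'] * 4,
    'seed': ['42', '3407', '42', '3407'],
    'step': [100, 100, 100, 100],
    'value': [0.61, 0.59, 0.57, 0.58],
})

# Average the logged metric over seeds for each method.
summary = df.groupby('method')['value'].mean()
print(summary)
```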

Owner

  • Name: Daniel Bershatsky
  • Login: daskol
  • Kind: user
  • Location: Russia, Moscow
  • Company: @skoltech-ai

Citation (CITATION.cff)

cff-version: 1.2.0
message: "If you use this software, please cite our work as below."
authors:
- family-names: "Bershatsky"
  given-names: "Daniel"
  orcid: "https://orcid.org/0000-0001-8917-8187"
- family-names: "Cherniuk"
  given-names: "Daria"
  orcid: "https://orcid.org/0000-0000-0000-0000"
- family-names: "Daulbaev"
  given-names: "Talgat"
  orcid: "https://orcid.org/0009-0000-3364-8979"
- family-names: "Mikhalev"
  given-names: "Aleksandr"
  orcid: "https://orcid.org/0000-0002-9274-7237"
- family-names: "Oseledets"
  given-names: "Ivan"
  orcid: "https://orcid.org/0000-0003-2071-2163"
title: "LoTR: Low Tensor Rank Weight Adaptation"
version: 0.1.0
date-released: 2024-02-02
url: "https://github.com/skolai/lotr"
preferred-citation:
  type: generic
  status: submitted
  title: "LoTR: Low Tensor Rank Weight Adaptation"
  authors:
  - family-names: "Bershatsky"
    given-names: "Daniel"
    orcid: "https://orcid.org/0000-0001-8917-8187"
  - family-names: "Cherniuk"
    given-names: "Daria"
    orcid: "https://orcid.org/0000-0000-0000-0000"
  - family-names: "Daulbaev"
    given-names: "Talgat"
    orcid: "https://orcid.org/0009-0000-3364-8979"
  - family-names: "Mikhalev"
    given-names: "Aleksandr"
    orcid: "https://orcid.org/0000-0002-9274-7237"
  - family-names: "Oseledets"
    given-names: "Ivan"
    orcid: "https://orcid.org/0000-0003-2071-2163"
  doi: 10.48550/arXiv.2402.01376
  date-published: 2024-02-02
  identifiers:
  - type: other
    value: "2402.01376"
    description: The ArXiv preprint of the paper

GitHub Events

Total
  • Watch event: 4
Last Year
  • Watch event: 4

Committers

Last synced: over 1 year ago

All Time
  • Total Commits: 54
  • Total Committers: 3
  • Avg Commits per committer: 18.0
  • Development Distribution Score (DDS): 0.074
Past Year
  • Commits: 11
  • Committers: 2
  • Avg Commits per committer: 5.5
  • Development Distribution Score (DDS): 0.273
Top Committers
Name Email Commits
Daniel Bershatsky d****2@s****u 50
Daniel Bershatsky d****y@g****m 3
Daria Cherniuk d****k@s****u 1
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: over 1 year ago

All Time
  • Total issues: 0
  • Total pull requests: 1
  • Average time to close issues: N/A
  • Average time to close pull requests: less than a minute
  • Total issue authors: 0
  • Total pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 1
  • Average time to close issues: N/A
  • Average time to close pull requests: less than a minute
  • Issue authors: 0
  • Pull request authors: 1
  • Average comments per issue: 0
  • Average comments per pull request: 0.0
  • Merged pull requests: 1
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • dmikushin (1)
Pull Request Authors
  • daskol (2)
Top Labels
Issue Labels
Pull Request Labels

Dependencies

Dockerfile docker
  • nvcr.io/nvidia/pytorch 23.05-py3 build
sandbox/postgres/docker-compose.yml docker
  • postgres 15
sandbox/tensorboard/Dockerfile docker
  • ubuntu 20.04 build
sandbox/tensorboard/docker-compose.yml docker
  • tensorboard latest
pyproject.toml pypi
  • torch *
  • typing-extensions python_version<'3.11'