relora
Official code for ReLoRA from the paper Stack More Layers Differently: High-Rank Training Through Low-Rank Updates
Science Score: 26.0%
This score indicates how likely the project is to be science-related, based on the following indicators:
- ○ CITATION.cff file
- ✓ codemeta.json file (found)
- ✓ .zenodo.json file (found)
- ○ DOI references
- ○ Academic publication links
- ○ Committers with academic emails
- ○ Institutional organization owner
- ○ JOSS paper metadata
- ○ Scientific vocabulary similarity (low similarity: 1.5%)
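The exact formula behind the score is not shown on this page. Purely as a toy illustration, a checklist score of this kind can be computed as a weighted fraction of satisfied indicators; every weight below is an assumption, so this sketch will not reproduce the 26.0% figure:
```
# Toy illustration of an indicator-based "science score".
# The real scoring formula is not documented here; all weights
# below are assumptions for demonstration purposes only.
INDICATORS = {
    "citation_cff": (False, 1.0),       # CITATION.cff file
    "codemeta_json": (True, 1.0),       # codemeta.json file (found)
    "zenodo_json": (True, 1.0),         # .zenodo.json file (found)
    "doi_references": (False, 1.0),
    "publication_links": (False, 1.0),
    "academic_committers": (False, 1.0),
    "institutional_owner": (False, 1.0),
    "joss_metadata": (False, 1.0),
    "vocabulary_similarity": (False, 1.0),
}

def science_score(indicators):
    """Weighted fraction of satisfied indicators, as a percentage."""
    total = sum(weight for _, weight in indicators.values())
    hit = sum(weight for present, weight in indicators.values() if present)
    return 100.0 * hit / total

print(f"{science_score(INDICATORS):.1f}%")
```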
Repository
Basic Info
- Host: GitHub
- Owner: Guitaricet
- License: apache-2.0
- Language: Jupyter Notebook
- Default Branch: main
- Homepage: https://arxiv.org/abs/2307.05695
- Size: 1.89 MB
Statistics
- Stars: 462
- Watchers: 9
- Forks: 40
- Open Issues: 5
- Releases: 0
Metadata Files
README.dev.md
Scripts to check that the most common training regimes work.
```
torchrun --nproc-per-node 2 torchrun_main.py \
    --dataset_path preprocessed_data/wikitext_wikitext-2-v1_EleutherAI_pythia-1.4b_512 \
    --model_name_or_path EleutherAI/pythia-1.4b \
    --use_peft \
    --relora 10 \
    --model_revision step1000 \
    --batch_size 4 \
    --total_batch_size 96 \
    --lr 5e-4 \
    --max_length 512 \
    --eval_every 20 \
    --save_every 20 \
    --num_training_steps 40 \
    --distributed_type ddp \
    --optimizer adam_zero \
    --tags debug

torchrun --nproc-per-node 2 torchrun_main.py \
    --dataset_path preprocessed_data/wikitext_wikitext-2-v1_EleutherAI_pythia-1.4b_512 \
    --model_name_or_path EleutherAI/pythia-1.4b \
    --model_revision step1000 \
    --batch_size 6 \
    --total_batch_size 96 \
    --lr 5e-4 \
    --max_length 512 \
    --eval_every 2 \
    --save_every 10 \
    --num_training_steps 20 \
    --distributed_type ddp \
    --tags debug,fsdp_debug

torchrun --nproc-per-node 2 torchrun_main.py \
    --dataset_path preprocessed_data/wikitext_wikitext-2-v1_t5-base_512 \
    --model_config configs/llama_250m.json \
    --batch_size 24 \
    --total_batch_size 96 \
    --lr 5e-4 \
    --max_length 512 \
    --eval_every 2 \
    --save_every 10 \
    --num_training_steps 20 \
    --distributed_type ddp \
    --tags debug,fsdp_debug

torchrun --nproc-per-node 2 torchrun_main.py \
    --dataset_path preprocessed_data/wikitext_wikitext-2-v1_t5-base_512 \
    --model_config configs/llama_250m.json \
    --batch_size 24 \
    --total_batch_size 96 \
    --lr 5e-4 \
    --max_length 512 \
    --eval_every 2 \
    --save_every 10 \
    --num_training_steps 20 \
    --distributed_type fsdp \
    --tags debug,fsdp_debug

torchrun --nproc-per-node 2 torchrun_main.py \
    --dataset_path preprocessed_data/wikitext_wikitext-2-v1_gpt2_512 \
    --model_config configs/llama_250m_50K.json \
    --batch_size 24 \
    --total_batch_size 96 \
    --lr 5e-4 \
    --max_length 512 \
    --eval_every 2 \
    --save_every 10 \
    --num_training_steps 20 \
    --distributed_type ddp \
    --dtype float32 \
    --tags debug,fsdp_debug

torchrun --nproc-per-node 2 torchrun_main.py \
    --model_config configs/llama_250m.json \
    --batch_size 24 \
    --total_batch_size 96 \
    --lr 5e-4 \
    --max_length 512 \
    --eval_every 2 \
    --save_every 10 \
    --num_training_steps 20000 \
    --distributed_type fsdp \
    --tags debug,fsdp_debug

torchrun --nproc-per-node 2 torchrun_main.py \
    --model_config configs/llama_250m.json \
    --batch_size 24 \
    --total_batch_size 96 \
    --lr 1e-3 \
    --max_length 512 \
    --use_peft \
    --relora 10 \
    --cycle_length 10 \
    --restart_warmup_steps 5 \
    --scheduler cosine_restarts \
    --warmup_steps 5 \
    --reset_optimizer_on_relora False \
    --optimizer_magnitude_pruning 0.9 \
    --num_training_steps 20000 \
    --save_every 5000 \
    --eval_every 5000 \
    --warmed_up_model checkpoints/llama_250m-2023-06-09-11-29-56/model_5000 \
    --distributed_type fsdp \
    --tags debug,fsdp_debug
```
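In these commands, `--relora 10` controls how often the low-rank (LoRA) update is merged back into the full weights and re-initialized; repeating this merge-and-restart cycle is how ReLoRA accumulates a high-rank update from a sequence of low-rank ones. Below is a minimal sketch of one such cycle using a toy LoRA layer. It is not the repository's implementation; the class and method names (`LoRALinear`, `merge_and_reinit`) are illustrative assumptions:
```
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Toy LoRA layer: frozen base weight plus a trainable low-rank delta."""
    def __init__(self, in_features, out_features, r=8, alpha=16):
        super().__init__()
        self.weight = nn.Parameter(
            torch.randn(out_features, in_features), requires_grad=False
        )
        self.lora_A = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, r))
        self.scaling = alpha / r

    def forward(self, x):
        # Effective weight is the frozen base plus the scaled low-rank update.
        return x @ (self.weight + self.scaling * self.lora_B @ self.lora_A).T

    @torch.no_grad()
    def merge_and_reinit(self):
        # ReLoRA restart: fold the low-rank update into the base weight,
        # then start a fresh low-rank factorization (B back to zero).
        self.weight += self.scaling * self.lora_B @ self.lora_A
        nn.init.normal_(self.lora_A, std=0.01)
        nn.init.zeros_(self.lora_B)

layer = LoRALinear(64, 64)
opt = torch.optim.Adam([layer.lora_A, layer.lora_B], lr=1e-3)
relora_every = 10  # corresponds to --relora 10 above

for step in range(40):
    x = torch.randn(4, 64)
    loss = layer(x).pow(2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    if (step + 1) % relora_every == 0:
        layer.merge_and_reinit()
        # The paper prunes most of the optimizer state at each restart
        # (cf. --optimizer_magnitude_pruning 0.9 with
        # --reset_optimizer_on_relora False); clearing it entirely here
        # is only a crude stand-in for that partial reset.
        opt.state.clear()
```
The last command above also pairs the restarts with `--scheduler cosine_restarts` and `--restart_warmup_steps`, so the learning rate is briefly re-warmed after each merge rather than following a single cosine decay.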
Owner
- Name: Vlad Lialin
- Login: Guitaricet
- Kind: user
- Location: San Francisco, CA
- Company: @1x-technologies
- Repositories: 75
- Profile: https://github.com/Guitaricet
Deep Learning for Robotics @ 1X Technologies
GitHub Events
Total
- Issues event: 1
- Watch event: 29
- Issue comment event: 6
- Fork event: 4
Last Year
- Issues event: 1
- Watch event: 29
- Issue comment event: 6
- Fork event: 4
Committers
Last synced: 6 months ago
Top Committers
| Name | Email | Commits |
|---|---|---|
| Vladislav Lialin | g****t@g****m | 214 |
Issues and Pull Requests
Last synced: 6 months ago
All Time
- Total issues: 17
- Total pull requests: 1
- Average time to close issues: 12 days
- Average time to close pull requests: N/A
- Total issue authors: 17
- Total pull request authors: 1
- Average comments per issue: 1.59
- Average comments per pull request: 0.0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Past Year
- Issues: 1
- Pull requests: 0
- Average time to close issues: N/A
- Average time to close pull requests: N/A
- Issue authors: 1
- Pull request authors: 0
- Average comments per issue: 4.0
- Average comments per pull request: 0
- Merged pull requests: 0
- Bot issues: 0
- Bot pull requests: 0
Top Authors
Issue Authors
- ScottishFold007 (1)
- omri123 (1)
- JinYujie99 (1)
- tiendung (1)
- wanghao-007 (1)
- thistleknot (1)
- vorobyov01 (1)
- ElleLeonne (1)
- itongggg (1)
- DaehanKim (1)
- haofanwang (1)
- henbucuoshanghai (1)
- datalee (1)
- mooncui (1)
- skykiseki (1)
Pull Request Authors
- Guitaricet (2)
Dependencies
- datasets *
- lion-pytorch *
- loguru *
- matplotlib *
- nvitop *
- peft *
- tokenizers *
- torch *
- transformers *
- wandb *
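All dependencies are listed with unpinned (`*`) versions. As a small convenience sketch (not part of the repository), one can verify that the listed packages are importable; the module names below are the usual import names and are assumptions where they differ from the PyPI names:
```
# Quick sanity check that the listed dependencies are installed.
# Import names are assumptions mapped from the PyPI package names above.
import importlib

PACKAGES = {
    "datasets": "datasets",
    "lion-pytorch": "lion_pytorch",
    "loguru": "loguru",
    "matplotlib": "matplotlib",
    "nvitop": "nvitop",
    "peft": "peft",
    "tokenizers": "tokenizers",
    "torch": "torch",
    "transformers": "transformers",
    "wandb": "wandb",
}

for pypi_name, module_name in PACKAGES.items():
    try:
        importlib.import_module(module_name)
        print(f"ok      {pypi_name}")
    except ImportError:
        print(f"MISSING {pypi_name}")
```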