https://github.com/beomi/bitnet-transformers

0️⃣1️⃣🤗 BitNet-Transformers: Huggingface Transformers Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in pytorch with Llama(2) Architecture

Science Score: 23.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (7.9%) to scientific vocabulary

Keywords

llm quantization quantization-aware-training transformers
Last synced: 5 months ago

Repository

0️⃣1️⃣🤗 BitNet-Transformers: Huggingface Transformers Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in pytorch with Llama(2) Architecture

Basic Info
  • Host: GitHub
  • Owner: Beomi
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 588 KB
Statistics
  • Stars: 302
  • Watchers: 9
  • Forks: 32
  • Open Issues: 8
  • Releases: 0
Topics
llm quantization quantization-aware-training transformers
Created over 2 years ago · Last pushed almost 2 years ago
Metadata Files
  • Readme: README.md

README.md

0️⃣1️⃣🤗 BitNet-Transformers: Huggingface Transformers Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in pytorch with Llama(2) Architecture

[Figure: BitNet architecture]

  • Paper Link: https://arxiv.org/pdf/2310.11453.pdf
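
For orientation: BitNet replaces the Transformer's nn.Linear layers with a "BitLinear" layer whose weights are binarized to -1/+1 in the forward pass and trained with a straight-through estimator. Below is a minimal, assumption-laden sketch of that idea; it is not the repo's actual bitnet_llama implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BitLinearSketch(nn.Linear):
    """Rough sketch of a BitNet-style 1-bit linear layer (illustrative only).

    The latent full-precision weight is binarized to -1/+1 on the fly; a
    straight-through estimator lets gradients reach the latent weight. The
    paper's activation quantization and norm placement are omitted here.
    """

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.weight
        w_centered = w - w.mean()           # zero-center (alpha = mean of W)
        beta = w_centered.abs().mean()      # per-tensor scaling factor
        w_bin = torch.sign(w_centered)      # -1/+1 (sign(0)=0 edge case ignored)
        # Straight-through estimator: forward uses the binarized weight,
        # backward treats the binarization as identity.
        w_q = w_centered + (w_bin - w_centered).detach()
        return F.linear(x, beta * w_q, self.bias)
```

A BitLLAMA model would presumably use such a layer in place of nn.Linear in the attention and MLP projections; the real class lives in bitnet_llama/modeling_llama.py.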

Prepare Dev env

```bash
# Clone this repo
git clone https://github.com/beomi/bitnet-transformers
cd bitnet-transformers

# Install requirements
pip install -r clm_requirements.txt

# Clone transformers repo
git clone https://github.com/huggingface/transformers
pip install -e transformers

# Update Llama(2) model
rm ./transformers/src/transformers/models/llama/modeling_llama.py
ln -s $(pwd)/bitnet_llama/modeling_llama.py ./transformers/src/transformers/models/llama/modeling_llama.py
```

This replaces transformers' modeling_llama.py with a symlink to bitnet_llama/modeling_llama.py. Since the file is symlinked, any change made to it is reflected in the transformers repo.
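
As a quick sanity check (not part of the repo's scripts), you can confirm that the editable transformers install resolves to the symlinked file:

```python
# Hypothetical sanity check: print the real path of the Llama modeling module.
import os
import transformers.models.llama.modeling_llama as modeling_llama

print(os.path.realpath(modeling_llama.__file__))
# expected to point at .../bitnet-transformers/bitnet_llama/modeling_llama.py
```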

Train Wikitext-103

[Figure: training loss of BitLLAMA on Wikitext-103]

You can track metrics via wandb

```bash
./train_wikitext.sh
```

GPU Mem Usage Comparison

Train Config

  • Batch size: 1
  • Gradient accumulation: 1
  • Seq length: 2048
  • Model: LLamaForCausalLM with BitLinear layer
  • Model size: 47,452,672 parameters (47.5M)
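
The flags actually used live in ./train_wikitext.sh; purely as an illustration, and assuming a standard Hugging Face Trainer setup, the config above corresponds roughly to:

```python
# Illustrative only: the config above expressed as HF TrainingArguments.
# The repo's ./train_wikitext.sh may pass these (and other) flags differently.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="bitllama-wikitext103",   # hypothetical output path
    per_device_train_batch_size=1,       # Batch size: 1
    gradient_accumulation_steps=1,       # Gradient accumulation: 1
    bf16=True,
    report_to="wandb",                   # metrics tracked via wandb
)
# The 2048-token sequence length is a tokenizer/dataset setting (e.g. block_size
# in a run_clm-style script), not a TrainingArguments field.
```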

Original LLAMA - 16bit

  • Uses 250 MB of GPU memory for model weights

BitLLAMA - Mixed 16bit

  • Uses 200 MB of GPU memory for model weights
  • Stores model weights in bf16 (or fp16)
  • Stores the -1/+1 1-bit weights as int8 (see the binarization sketch below)
  • Uses more memory during training than the original LLAMA: the 1-bit and 16-bit weights are kept together
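
For reference, a minimal sketch of what the int8 binarization could look like (BitNet-style per-tensor centering, sign, and scale); the function name and details are assumptions, not the repo's modeling_llama.py:

```python
import torch

def binarize_weight(w: torch.Tensor):
    """BitNet-style weight binarization sketch (illustrative, not the repo's code).

    Zero-center the weight, map it to -1/+1 with sign(), and keep a per-tensor
    scaling factor so the original magnitude can be restored.
    """
    w_centered = w - w.mean()              # alpha = mean(W)
    w_bin = torch.sign(w_centered)         # -1/+1, storable as int8
    beta = w_centered.abs().mean()         # per-tensor scale
    return w_bin.to(torch.int8), beta
```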

BitLLAMA - 8bit

  • Uses 100 MB of GPU memory for model weights
  • Dequantizes to bf16 (or fp16) on the fly when needed (see the sketch below)
  • Stores the 1-bit BitLinear weights and the remaining weights in 8 bit
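
The on-the-fly step amounts to rebuilding a bf16/fp16 tensor from the stored int8 signs and the scale. A hypothetical helper, not code from this repo:

```python
import torch

def dequantize_1bit(w_int8: torch.Tensor, beta: torch.Tensor,
                    dtype: torch.dtype = torch.bfloat16) -> torch.Tensor:
    """Rebuild a bf16/fp16 weight from stored int8 -1/+1 values and the scale."""
    return w_int8.to(dtype) * beta.to(dtype)
```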

BitLLAMA - 1bit

  • Dequantizes to bf16 (or fp16) on the fly when needed
  • Stores each 1-bit weight using a single bit (see the packing sketch below)

```bash
TBD
```
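
The 1-bit variant would store only the sign bits themselves, eight weights per byte. A rough sketch of such packing (hypothetical helper, matching the "uint8 instead of bfloat16" todo below):

```python
import torch

def pack_1bit(w_bin: torch.Tensor) -> torch.Tensor:
    """Pack -1/+1 weights into uint8, eight weights per byte (hypothetical helper)."""
    bits = (w_bin > 0).to(torch.uint8).flatten()        # map -1 -> 0, +1 -> 1
    pad = (-bits.numel()) % 8
    if pad:                                             # pad to a whole number of bytes
        bits = torch.cat([bits, bits.new_zeros(pad)])
    place = torch.tensor([1, 2, 4, 8, 16, 32, 64, 128], dtype=torch.uint8)
    return (bits.view(-1, 8) * place).sum(dim=1).to(torch.uint8)
```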

Todo

  • [x] Add BitLinear layer
  • [x] Add LLamaForCausalLM model with BitLinear layer
    • [x] Update .save_pretrained method (for 1-bit weight saving)
  • [x] Add sample code for LM training
  • [ ] Update BitLinear layer to use 1-bit weight
    • [ ] Use uint8 instead of bfloat16
    • [ ] Use custom cuda kernel for 1-bit weight

GitHub Events

Total
  • Watch event: 45
  • Issue comment event: 1
  • Fork event: 4
Last Year
  • Watch event: 45
  • Issue comment event: 1
  • Fork event: 4

Committers

Last synced: 9 months ago

All Time
  • Total Commits: 18
  • Total Committers: 1
  • Avg Commits per committer: 18.0
  • Development Distribution Score (DDS): 0.0
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
  • junbum lee (j****n@b****t): 18 commits
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: 9 months ago

All Time
  • Total issues: 11
  • Total pull requests: 2
  • Average time to close issues: 3 minutes
  • Average time to close pull requests: 1 minute
  • Total issue authors: 9
  • Total pull request authors: 2
  • Average comments per issue: 0.82
  • Average comments per pull request: 0.0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • DewEfresh (1)
  • thwannbe (1)
  • ttl10101 (1)
  • nevakrien (1)
  • chuxiliyixiaosa (1)
  • klei22 (1)
  • Ywandung-Lyou (1)
  • darkman111a (1)
Pull Request Authors
  • eltociear (2)
  • gtpk (2)