https://github.com/beomi/bitnet-transformers

0️⃣1️⃣🤗 BitNet-Transformers: Huggingface Transformers Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in pytorch with Llama(2) Architecture

Science Score: 23.0%

This score indicates how likely this project is to be science-related based on various indicators:

  • CITATION.cff file
  • codemeta.json file
    Found codemeta.json file
  • .zenodo.json file
  • DOI references
  • Academic publication links
    Links to: arxiv.org
  • Committers with academic emails
  • Institutional organization owner
  • JOSS paper metadata
  • Scientific vocabulary similarity
    Low similarity (7.9%) to scientific vocabulary

Keywords

llm quantization quantization-aware-training transformers
Last synced: 5 months ago

Repository

0️⃣1️⃣🤗 BitNet-Transformers: Huggingface Transformers Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in pytorch with Llama(2) Architecture

Basic Info
  • Host: GitHub
  • Owner: Beomi
  • Language: Python
  • Default Branch: main
  • Homepage:
  • Size: 588 KB
Statistics
  • Stars: 302
  • Watchers: 9
  • Forks: 32
  • Open Issues: 8
  • Releases: 0
Topics
llm quantization quantization-aware-training transformers
Created over 2 years ago · Last pushed almost 2 years ago
Metadata Files
  • Readme: README.md

README.md

0️⃣1️⃣🤗 BitNet-Transformers: Huggingface Transformers Implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in pytorch with Llama(2) Architecture

[Figure: BitNet architecture]

  • Paper Link: https://arxiv.org/pdf/2310.11453.pdf
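
For orientation: BitNet replaces the Transformer's nn.Linear layers with a "BitLinear" layer whose weights are binarized to -1/+1 in the forward pass and trained with a straight-through estimator. Below is a minimal, assumption-laden sketch of that idea; it is not the repo's actual bitnet_llama implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class BitLinearSketch(nn.Linear):
    """Rough sketch of a BitNet-style 1-bit linear layer (illustrative only).

    The latent full-precision weight is binarized to -1/+1 on the fly; a
    straight-through estimator lets gradients reach the latent weight. The
    paper's activation quantization and norm placement are omitted here.
    """

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        w = self.weight
        w_centered = w - w.mean()           # zero-center (alpha = mean of W)
        beta = w_centered.abs().mean()      # per-tensor scaling factor
        w_bin = torch.sign(w_centered)      # -1/+1 (sign(0)=0 edge case ignored)
        # Straight-through estimator: forward uses the binarized weight,
        # backward treats the binarization as identity.
        w_q = w_centered + (w_bin - w_centered).detach()
        return F.linear(x, beta * w_q, self.bias)
```

A BitLLAMA model would presumably use such a layer in place of nn.Linear in the attention and MLP projections; the real class lives in bitnet_llama/modeling_llama.py.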

Prepare Dev env

```bash
# Clone this repo
git clone https://github.com/beomi/bitnet-transformers
cd bitnet-transformers

# Install requirements
pip install -r clm_requirements.txt

# Clone transformers repo
git clone https://github.com/huggingface/transformers
pip install -e transformers

# Update Llama(2) model
rm ./transformers/src/transformers/models/llama/modeling_llama.py
ln -s $(pwd)/bitnet_llama/modeling_llama.py ./transformers/src/transformers/models/llama/modeling_llama.py
```

This replaces transformers' modeling_llama.py with a symlink to bitnet_llama/modeling_llama.py. Since the file is symlinked, any change made to it is reflected in the transformers repo.
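
As a quick sanity check (not part of the repo's scripts), you can confirm that the editable transformers install resolves to the symlinked file:

```python
# Hypothetical sanity check: print the real path of the Llama modeling module.
import os
import transformers.models.llama.modeling_llama as modeling_llama

print(os.path.realpath(modeling_llama.__file__))
# expected to point at .../bitnet-transformers/bitnet_llama/modeling_llama.py
```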

Train Wikitext-103

[Figure: training loss of BitLLAMA on Wikitext-103]

You can track metrics via wandb

```bash
./train_wikitext.sh
```

GPU Mem Usage Comparison

Train Config

  • Batch size: 1
  • Gradient accumulation: 1
  • Seq length: 2048
  • Model: LLamaForCausalLM with BitLinear layer
  • Model size: 47,452,672 parameters (47.5M)
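
The flags actually used live in ./train_wikitext.sh; purely as an illustration, and assuming a standard Hugging Face Trainer setup, the config above corresponds roughly to:

```python
# Illustrative only: the config above expressed as HF TrainingArguments.
# The repo's ./train_wikitext.sh may pass these (and other) flags differently.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="bitllama-wikitext103",   # hypothetical output path
    per_device_train_batch_size=1,       # Batch size: 1
    gradient_accumulation_steps=1,       # Gradient accumulation: 1
    bf16=True,
    report_to="wandb",                   # metrics tracked via wandb
)
# The 2048-token sequence length is a tokenizer/dataset setting (e.g. block_size
# in a run_clm-style script), not a TrainingArguments field.
```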

Original LLAMA - 16bit

  • Uses 250 MB of GPU memory for model weights

BitLLAMA - Mixed 16bit

  • Uses 200 MB of GPU memory for model weights
  • Stores model weights in bf16 (or fp16)
  • Stores the -1/+1 1-bit weights as int8 (see the binarization sketch below)
  • Uses more memory during training than the original LLAMA: the 1-bit and 16-bit weights are kept together
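
For reference, a minimal sketch of what the int8 binarization could look like (BitNet-style per-tensor centering, sign, and scale); the function name and details are assumptions, not the repo's modeling_llama.py:

```python
import torch

def binarize_weight(w: torch.Tensor):
    """BitNet-style weight binarization sketch (illustrative, not the repo's code).

    Zero-center the weight, map it to -1/+1 with sign(), and keep a per-tensor
    scaling factor so the original magnitude can be restored.
    """
    w_centered = w - w.mean()              # alpha = mean(W)
    w_bin = torch.sign(w_centered)         # -1/+1, storable as int8
    beta = w_centered.abs().mean()         # per-tensor scale
    return w_bin.to(torch.int8), beta
```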

BitLLAMA - 8bit

  • Uses 100 MB of GPU memory for model weights
  • Dequantizes to bf16 (or fp16) on the fly when needed (see the sketch below)
  • Stores the 1-bit BitLinear weights and the remaining weights in 8 bit
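
The on-the-fly step amounts to rebuilding a bf16/fp16 tensor from the stored int8 signs and the scale. A hypothetical helper, not code from this repo:

```python
import torch

def dequantize_1bit(w_int8: torch.Tensor, beta: torch.Tensor,
                    dtype: torch.dtype = torch.bfloat16) -> torch.Tensor:
    """Rebuild a bf16/fp16 weight from stored int8 -1/+1 values and the scale."""
    return w_int8.to(dtype) * beta.to(dtype)
```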

BitLLAMA - 1bit

  • Dequantizes to bf16 (or fp16) on the fly when needed
  • Stores each 1-bit weight using a single bit (see the packing sketch below)

```bash
TBD
```
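
The 1-bit variant would store only the sign bits themselves, eight weights per byte. A rough sketch of such packing (hypothetical helper, matching the "uint8 instead of bfloat16" todo below):

```python
import torch

def pack_1bit(w_bin: torch.Tensor) -> torch.Tensor:
    """Pack -1/+1 weights into uint8, eight weights per byte (hypothetical helper)."""
    bits = (w_bin > 0).to(torch.uint8).flatten()        # map -1 -> 0, +1 -> 1
    pad = (-bits.numel()) % 8
    if pad:                                             # pad to a whole number of bytes
        bits = torch.cat([bits, bits.new_zeros(pad)])
    place = torch.tensor([1, 2, 4, 8, 16, 32, 64, 128], dtype=torch.uint8)
    return (bits.view(-1, 8) * place).sum(dim=1).to(torch.uint8)
```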

Todo

  • [x] Add BitLinear layer
  • [x] Add LLamaForCausalLM model with BitLinear layer
    • [x] Update .save_pretrained method (for 1-bit weight saving)
  • [x] Add sample code for LM training
  • [ ] Update BitLinear layer to use 1-bit weight
    • [ ] Use uint8 instead of bfloat16
    • [ ] Use custom cuda kernel for 1-bit weight

GitHub Events

Total
  • Watch event: 45
  • Issue comment event: 1
  • Fork event: 4
Last Year
  • Watch event: 45
  • Issue comment event: 1
  • Fork event: 4

Committers

Last synced: 9 months ago

All Time
  • Total Commits: 18
  • Total Committers: 1
  • Avg Commits per committer: 18.0
  • Development Distribution Score (DDS): 0.0
Past Year
  • Commits: 0
  • Committers: 0
  • Avg Commits per committer: 0.0
  • Development Distribution Score (DDS): 0.0
Top Committers
  • junbum lee (j****n@b****t): 18 commits
Committer Domains (Top 20 + Academic)

Issues and Pull Requests

Last synced: 9 months ago

All Time
  • Total issues: 11
  • Total pull requests: 2
  • Average time to close issues: 3 minutes
  • Average time to close pull requests: 1 minute
  • Total issue authors: 9
  • Total pull request authors: 2
  • Average comments per issue: 0.82
  • Average comments per pull request: 0.0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Past Year
  • Issues: 0
  • Pull requests: 0
  • Average time to close issues: N/A
  • Average time to close pull requests: N/A
  • Issue authors: 0
  • Pull request authors: 0
  • Average comments per issue: 0
  • Average comments per pull request: 0
  • Merged pull requests: 0
  • Bot issues: 0
  • Bot pull requests: 0
Top Authors
Issue Authors
  • DewEfresh (1)
  • thwannbe (1)
  • ttl10101 (1)
  • nevakrien (1)
  • chuxiliyixiaosa (1)
  • klei22 (1)
  • Ywandung-Lyou (1)
  • darkman111a (1)
Pull Request Authors
  • eltociear (2)
  • gtpk (2)