Updated 6 months ago

qonnx • Rank 18.7 • Science 77%

QONNX: Arbitrary-Precision Quantized Neural Networks in ONNX

Updated 6 months ago

torchao • Rank 25.9 • Science 64%

PyTorch native quantization and sparsity for training and inference

Updated 6 months ago

onnx2tf • Rank 22.1 • Science 67%

Self-created tool to convert ONNX files (NCHW) to TensorFlow/TFLite/Keras format (NHWC). Its purpose is to solve the massive Transpose extrapolation problem in onnx-tensorflow (onnx-tf). I don't need a Star, but give me a pull request.

Updated 6 months ago

llmcompressor • Rank 22.6 • Science 54%

Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM

Updated 6 months ago

optimum • Rank 27.2 • Science 36%

🚀 Accelerate inference and training of 🤗 Transformers, Diffusers, TIMM and Sentence Transformers with easy-to-use hardware optimization tools

Updated 6 months ago

q-galore • Rank 5.3 • Science 54%

Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients.

Updated 6 months ago

chinese-llama-alpaca • Rank 12.2 • Science 46%

Chinese LLaMA & Alpaca large language models, with local CPU/GPU training and deployment (Chinese LLaMA & Alpaca LLMs)

Updated 6 months ago

tf2deepfloorplan • Rank 6.2 • Science 51%

TF2 Deep FloorPlan Recognition using a Multi-task Network with Room-boundary-Guided Attention. Supports TensorBoard, quantization, Flask, TFLite, Docker, GitHub Actions, and Google Colab.

Updated 5 months ago

https://github.com/mobiusml/hqq • Rank 19.5 • Science 36%

Official implementation of Half-Quadratic Quantization (HQQ)

Updated 5 months ago

https://github.com/cedrickchee/awesome-ml-model-compression • Rank 7.9 • Science 46%

Awesome machine learning model compression research papers, quantization, tools, and learning material.

Updated 5 months ago

https://github.com/beomi/bitnet-transformers • Rank 5.7 • Science 23%

0️⃣1️⃣🤗 BitNet-Transformers: Hugging Face Transformers implementation of "BitNet: Scaling 1-bit Transformers for Large Language Models" in PyTorch with the Llama(2) architecture

Updated 6 months ago

master-thesis • Science 44%

One Bit at a Time: Impact of Quantisation on Neural Machine Translation

Updated 6 months ago

hailo_model_zoo • Science 26%

The Hailo Model Zoo includes pre-trained models and a full building and evaluation environment

Updated 6 months ago

psumsim • Science 44%

PSumSim: A Simulator for Partial-Sum Quantization in Analog Matrix-Vector Multipliers