Updated 4 months ago

qonnx • Rank 18.7 • Science 77%

QONNX: Arbitrary-Precision Quantized Neural Networks in ONNX

Updated 4 months ago

torchao • Rank 25.9 • Science 64%

PyTorch native quantization and sparsity for training and inference

Updated 4 months ago

onnx2tf • Rank 22.1 • Science 67%

Self-created tool to convert ONNX files (NCHW) to TensorFlow/TFLite/Keras format (NHWC). Its purpose is to solve the massive Transpose extrapolation problem in onnx-tensorflow (onnx-tf). I don't need a Star, but give me a pull request.

Updated 4 months ago

llmcompressor • Rank 22.6 • Science 54%

Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM

Updated 4 months ago

q-galore • Rank 5.3 • Science 54%

Q-GaLore: Quantized GaLore with INT4 Projection and Layer-Adaptive Low-Rank Gradients

Updated 4 months ago

tf2deepfloorplan • Rank 6.2 • Science 51%

TF2 Deep FloorPlan Recognition using a Multi-task Network with Room-boundary-Guided Attention. Supports TensorBoard, quantization, Flask, TFLite, Docker, GitHub Actions, and Google Colab.

Updated 4 months ago

master-thesis • Science 44%

One Bit at a Time: Impact of Quantisation on Neural Machine Translation

Updated 4 months ago

psumsim • Science 44%

PSumSim: A Simulator for Partial-Sum Quantization in Analog Matrix-Vector Multipliers