Recent Releases of gpullama3.java
gpullama3.java - v0.1.0-beta
- Llama 3 model compatibility - Full support for Llama 3.0, 3.1, and 3.2 models
- GGUF format support - Native handling of GGUF model files
- Support for FP16 models for reduced memory usage and faster computation
- GPU acceleration on NVIDIA GPUs through both the OpenCL and PTX backends
- [Experimental] Support for Apple Silicon (M1/M2/M3) via OpenCL (subject to hardware/compiler limitations)
- [Experimental] Initial support for Q8 and Q4 quantized models, using runtime dequantization to FP16
Language: Java
Published by mikepapadim about 1 year ago
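
The "runtime dequantization" item above can be illustrated with a minimal sketch. It assumes that "Q8" refers to the Q8_0 block layout commonly used in GGUF files (one FP16 scale followed by 32 signed 8-bit values, each weight recovered as scale * q). The class and method names are hypothetical and are not taken from gpullama3.java, and the sketch writes 32-bit Java floats for readability, whereas the release describes dequantizing to FP16.

```java
// Hypothetical illustration only; not the project's actual code.
final class Q8BlockSketch {
    // Assumed Q8_0 block size: 32 quantized values share one FP16 scale.
    static final int BLOCK_SIZE = 32;

    // Decode IEEE 754 half-precision bits into a Java float.
    static float halfToFloat(short bits) {
        int h = bits & 0xFFFF;
        int sign = (h >>> 15) & 0x1;
        int exp = (h >>> 10) & 0x1F;
        int mant = h & 0x3FF;
        float value;
        if (exp == 0) {
            // Zero or subnormal: mant * 2^-24
            value = Math.scalb((float) mant, -24);
        } else if (exp == 31) {
            // Infinity or NaN
            value = (mant == 0) ? Float.POSITIVE_INFINITY : Float.NaN;
        } else {
            // Normal number: (1 + mant / 1024) * 2^(exp - 15)
            value = Math.scalb(1.0f + mant / 1024.0f, exp - 15);
        }
        return sign == 0 ? value : -value;
    }

    // Recover one block of weights: each weight is scale * quantized value.
    static float[] dequantizeBlock(short scaleBits, byte[] quants) {
        if (quants.length != BLOCK_SIZE) {
            throw new IllegalArgumentException("expected " + BLOCK_SIZE + " quantized values");
        }
        float scale = halfToFloat(scaleBits);
        float[] out = new float[BLOCK_SIZE];
        for (int i = 0; i < BLOCK_SIZE; i++) {
            out[i] = scale * quants[i];
        }
        return out;
    }
}
```

A GPU implementation would typically perform the same per-block multiply inside the kernel rather than on the host; the host-side loop here is only meant to show the arithmetic.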