Recent Releases of gpullama3.java

Llama 3 model compatibility - Full support for Llama 3.0, 3.1, and 3.2 models
GGUF format support - Native handling of GGUF model files
Support for FP16 models for reduced memory usage and faster computation
GPU acceleration on NVIDIA GPUs through both the OpenCL and PTX backends
[Experimental] Support for Apple Silicon (M1/M2/M3) via OpenCL (subject to hardware/compiler limitations)
[Experimental] Initial support for Q8 and Q4 quantized models, using runtime dequantization to FP16
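To make the runtime dequantization item concrete, here is a minimal sketch of how one Q8_0 block can be expanded into floating-point values. It assumes the Q8_0 layout used by GGML-style GGUF files: 34 bytes per block, holding a little-endian FP16 scale followed by 32 signed 8-bit weights. The class and method names (`Q8BlockSketch`, `dequantizeBlock`, `halfToFloat`) are hypothetical and are not taken from the gpullama3.java sources. For readability the sketch widens to 32-bit `float` on the CPU, whereas the release describes dequantizing to FP16.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical illustration only; not the project's actual code path.
final class Q8BlockSketch {
    static final int VALUES_PER_BLOCK = 32;                 // weights per Q8_0 block
    static final int BLOCK_BYTES = 2 + VALUES_PER_BLOCK;    // FP16 scale + 32 int8 weights

    // Converts IEEE 754 half-precision bits to a 32-bit float.
    static float halfToFloat(short h) {
        int bits = h & 0xFFFF;
        int sign = (bits >>> 15) & 0x1;
        int exp = (bits >>> 10) & 0x1F;
        int mant = bits & 0x3FF;
        float value;
        if (exp == 0) {
            // Zero or subnormal: mant * 2^-24
            value = mant * (float) Math.pow(2, -24);
        } else if (exp == 31) {
            value = (mant == 0) ? Float.POSITIVE_INFINITY : Float.NaN;
        } else {
            // Normal number: (1 + mant / 1024) * 2^(exp - 15)
            value = (1.0f + mant / 1024.0f) * (float) Math.pow(2, exp - 15);
        }
        return sign == 0 ? value : -value;
    }

    // Expands one 34-byte Q8_0 block into 32 floats: weight = scale * quantized value.
    static float[] dequantizeBlock(byte[] block) {
        if (block.length != BLOCK_BYTES) {
            throw new IllegalArgumentException("expected " + BLOCK_BYTES + " bytes, got " + block.length);
        }
        ByteBuffer buf = ByteBuffer.wrap(block).order(ByteOrder.LITTLE_ENDIAN);
        float scale = halfToFloat(buf.getShort());
        float[] out = new float[VALUES_PER_BLOCK];
        for (int i = 0; i < VALUES_PER_BLOCK; i++) {
            out[i] = scale * buf.get();   // buf.get() yields a signed byte
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] block = new byte[BLOCK_BYTES];
        block[0] = 0x00;                  // FP16 0x3800 (0.5), low byte first
        block[1] = 0x38;
        block[2] = 2;
        block[3] = (byte) -4;
        float[] weights = dequantizeBlock(block);
        System.out.println(weights[0] + " " + weights[1]);
    }
}
```

Storing a single scale per 32 weights is what keeps the quantized file compact, at the cost of one multiply per weight when the block is expanded at load or inference time.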

- Java
Published by mikepapadim about 1 year ago
