Recent Releases of gpullama3.java

gpullama3.java - v0.1.0-beta

  • Llama 3 model compatibility - Full support for Llama 3.0, 3.1, and 3.2 models
  • GGUF format support - Native handling of GGUF model files
  • FP16 model support - Reduced memory usage and faster computation
  • GPU acceleration - Runs on NVIDIA GPUs through both the OpenCL and PTX backends
  • [Experimental] Support for Apple Silicon (M1/M2/M3) via OpenCL (subject to hardware/compiler limitations)
  • [Experimental] Initial support for Q8 and Q4 quantized models, using runtime dequantization to FP16
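
To illustrate what runtime dequantization of a Q8 model involves, here is a minimal CPU-side sketch for a single Q8_0 block, assuming the usual GGUF layout of one little-endian FP16 scale followed by 32 signed 8-bit values. The class and method names (Q8Dequant, halfToFloat, dequantizeBlock) are hypothetical and are not taken from gpullama3.java. The sketch also widens to 32-bit floats on the CPU for readability, whereas the release describes dequantizing to FP16.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

public class Q8Dequant {
    // Assumption: a Q8_0 block holds 32 weights that share one FP16 scale.
    static final int BLOCK_SIZE = 32;

    // Decode IEEE 754 half-precision bits into a Java float.
    static float halfToFloat(short h) {
        int bits = h & 0xFFFF;
        int sign = (bits >>> 15) & 0x1;
        int exp = (bits >>> 10) & 0x1F;
        int mant = bits & 0x3FF;
        float value;
        if (exp == 0) {
            // Zero or subnormal: mant * 2^-24
            value = mant * (float) Math.pow(2, -24);
        } else if (exp == 31) {
            value = (mant == 0) ? Float.POSITIVE_INFINITY : Float.NaN;
        } else {
            // Normal number: (1 + mant/1024) * 2^(exp - 15)
            value = (1.0f + mant / 1024.0f) * (float) Math.pow(2, exp - 15);
        }
        return sign == 0 ? value : -value;
    }

    // Dequantize one block starting at byte `offset`: weight = scale * int8 value.
    static float[] dequantizeBlock(ByteBuffer buf, int offset) {
        float scale = halfToFloat(buf.getShort(offset));
        float[] out = new float[BLOCK_SIZE];
        for (int i = 0; i < BLOCK_SIZE; i++) {
            out[i] = scale * buf.get(offset + 2 + i);
        }
        return out;
    }

    public static void main(String[] args) {
        // Build a synthetic block: scale 0.5 (0x3800 in FP16), values -16..15.
        ByteBuffer buf = ByteBuffer.allocate(2 + BLOCK_SIZE).order(ByteOrder.LITTLE_ENDIAN);
        buf.putShort(0, (short) 0x3800);
        for (int i = 0; i < BLOCK_SIZE; i++) {
            buf.put(2 + i, (byte) (i - 16));
        }
        float[] w = dequantizeBlock(buf, 0);
        System.out.println(w[0] + " " + w[16] + " " + w[31]); // prints "-8.0 0.0 7.5"
    }
}
```

Q4 blocks follow the same scale-times-integer idea, but pack two 4-bit values per byte, so each byte has to be split into nibbles before scaling.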

- Java
Published by mikepapadim about 1 year ago