Updated 6 months ago
https://github.com/libxsmm/libxsmm
Library for specialized dense and sparse matrix operations, and deep learning primitives.
Updated 5 months ago
https://github.com/ashvardanian/simsimd
Up to 200x Faster Dot Products & Similarity Metrics for Python, Rust, C, JS, and Swift, supporting f64, f32, f16 real & complex, i8, and bit vectors using SIMD for AVX2, AVX-512, NEON, SVE, & SVE2 📐
Updated 6 months ago
kernel_float
CUDA/HIP header-only library for low-precision (16-bit, 8-bit) and vectorized GPU kernel development