opt\_einsum - A Python package for optimizing contraction order for einsum-like expressions
opt\_einsum - A Python package for optimizing contraction order for einsum-like expressions - Published in JOSS (2018)
PeleLMeX
PeleLMeX: an AMR Low Mach Number Reactive Flow Simulation Code without level sub-cycling - Published in JOSS (2023)
`hessQuik`
`hessQuik`: Fast Hessian computation of composite functions - Published in JOSS (2022)
Krang
Krang: Kerr Raytracer for Analytic Null Geodesics - Published in JOSS (2024)
KomaMRI
Koma is a Pulseq-compatible framework to efficiently simulate Magnetic Resonance Imaging (MRI) acquisitions. The main focus of this package is to simulate general scenarios that could arise in pulse sequence development.
montecarlomeasurements.jl
Propagation of distributions by Monte-Carlo sampling: Real number types with uncertainty represented by samples.
https://github.com/tensorflow/tfjs
A WebGL-accelerated JavaScript library for training and deploying ML models.
https://github.com/clesperanto/pyclesperanto_prototype
GPU-accelerated bio-image analysis focusing on 3D+t microscopy image data
https://github.com/conradsnicta/bandicoot-code
Bandicoot: C++ library for GPU linear algebra & scientific computing - https://coot.sourceforge.io
https://github.com/beehive-lab/kfusion-tornadovm
🎥 A Java implementation of Kinect Fusion running on Tornado VM.
https://github.com/beehive-lab/tornadovm
TornadoVM: A practical and efficient heterogeneous programming framework for managed languages
kmm
KMM: parallel dataflow scheduler and efficient memory management for multi-GPU platforms
parsec
PaRSEC is a generic framework for architecture-aware scheduling and management of micro-tasks on distributed, GPU-accelerated, many-core heterogeneous architectures. PaRSEC assigns computation threads to cores and GPU accelerators, overlaps communication with computation, and uses a dynamic, fully distributed scheduler based on architectural features such as NUMA nodes and algorithmic features such as data reuse.
waveletml
WaveletML: A Scalable and Extensible Wavelet Neural Network Framework
aestream
Efficient streaming of sparse event data supporting files, network I/O, GPU peripherals (via Torch/Jax/Numpy) and neuromorphic protocols
air-traffic-distribution
A GPU-Accelerated Multi-Objective Genetic Algorithm for Air Traffic Management
dplasma
DPLASMA is a highly optimized, accelerator-aware implementation of a dense linear algebra package for distributed heterogeneous systems. It is designed to deliver sustained performance on distributed systems where each node features multiple sockets of multicore processors and, if available, accelerators, using the PaRSEC runtime as a backend.
elphbolt
A solver for the coupled and decoupled electron and phonon Boltzmann transport equations.
sagecal
SAGECal is a fast, memory-efficient, GPU-accelerated radio interferometric calibration program. It supports all source models, including points, Gaussians, and Shapelets. Distributed calibration using MPI and consensus optimization is enabled. Both spectral and spatial priors can be used as constraints. Tools to build/restore sky models are included.
gpu_programming_beginner
Fundamentals of heterogeneous parallel programming with CUDA C/C++ at the beginner level.