mpi4jax
mpi4jax: Zero-copy MPI communication of JAX arrays - Published in JOSS (2021)
t8code - modular adaptive mesh refinement in the exascale era
t8code - modular adaptive mesh refinement in the exascale era - Published in JOSS (2025)
Chips-n-Salsa
Chips-n-Salsa: A Java Library of Customizable, Hybridizable, Iterative, Parallel, Stochastic, and Self-Adaptive Local Search Algorithms - Published in JOSS (2020)
pysdc
pySDC is a Python implementation of the spectral deferred correction (SDC) approach and its variants, especially the multilevel extension MLSDC and PFASST.
wrapyfi
Robotics MOM and RPC middleware wrapper with deep-learning framework integration
ppdyn
Plasma Particle Dynamics (PPDyn) is a Python code that simulates plasma particles using a molecular dynamics algorithm. It uses the Numba JIT compiler for faster performance. Detailed documentation can be found at https://ppdyn.readthedocs.io/.
sundials
Official development repository for SUNDIALS - a SUite of Nonlinear and DIfferential/ALgebraic equation Solvers. Pull requests are welcome for bug fixes and minor changes.
parafilt
Collection of parallel adaptive filter implementations for efficient signal processing applications in PyTorch.
pyhpc-benchmarks
A suite of benchmarks for CPU and GPU performance of the most popular high-performance libraries for Python 🚀
rockyml
⛰️ RockyML - A High-Performance Scientific Computing Framework for Non-smooth Machine Learning Problems
cato
Automatic source transformation to apply HPC frameworks with minimal user interaction
kokkos
Kokkos C++ Performance Portability Programming Ecosystem: The Programming Model - Parallel Execution and Memory Abstraction
p4pdes
C and Python examples from my book on using PETSc and Firedrake to solve PDEs
pararealgpu.jl
A distributed and GPU-based implementation of the Parareal algorithm for parallel-in-time integration of equations of motion.
mcmlgpu
This repository contains the base code for GPU-based Monte Carlo simulations of light transport in turbid media.
enance-amamento
A template library for headless rendering of Signed Distance Fields based on OpenMP.
tomobear
TomoBEAR is a configurable and customizable modular pipeline for streamlined processing of cryo-electron tomographic data for subtomogram averaging.
portageclusterutils
Cluster and parallelization utilities that came out of the Portage SMT project
chunklist
A Chunk List is a new, concurrent, chunk-based data structure that is easily modifiable and allows for fast run-time operations.
pyqkd
Repository with code for simulating and optimising quantum key distribution protocols.
charm
The Charm++ parallel programming system. Visit https://charmplusplus.org/ for more information.
MaterialPointSolver
🧮 High-performance Material Point Method (MPM) Solver in Julia.
librom
Data-driven model reduction library with an emphasis on large scale parallelism and linear subspace methods
propulate
Propulate is an asynchronous population-based optimization algorithm and software package for global optimization and hyperparameter search on high-performance computers.