mpi4jax
mpi4jax: Zero-copy MPI communication of JAX arrays - Published in JOSS (2021)
UltraNest - a robust, general purpose Bayesian inference engine
UltraNest - a robust, general purpose Bayesian inference engine - Published in JOSS (2021)
schwimmbad
schwimmbad: A uniform interface to parallel processing pools in Python - Published in JOSS (2017)
t8code - modular adaptive mesh refinement in the exascale era
t8code - modular adaptive mesh refinement in the exascale era - Published in JOSS (2025)
fmcmc
fmcmc: A friendly MCMC framework - Published in JOSS (2019)
GridapDistributed
GridapDistributed: a massively parallel finite element toolbox in Julia - Published in JOSS (2022)
Chips-n-Salsa
Chips-n-Salsa: A Java Library of Customizable, Hybridizable, Iterative, Parallel, Stochastic, and Self-Adaptive Local Search Algorithms - Published in JOSS (2020)
Caffeine
Caffeine: A parallel runtime library for supporting modern Fortran compilers - Published in JOSS (2025)
CLAIRE
CLAIRE: Constrained Large Deformation Diffeomorphic Image Registration on Parallel Computing Architectures - Published in JOSS (2021)
SARAS
SARAS: A general-purpose PDE solver for fluid dynamics - Published in JOSS (2021)
ACHR.cu
ACHR.cu: GPU-accelerated sampling of metabolic networks - Published in JOSS (2019)
pysdc
pySDC is a Python implementation of the spectral deferred correction (SDC) approach and its flavors, especially the multilevel extension MLSDC and PFASST.
wrapyfi
Robotics MOM and RPC middleware wrapper with deep-learning framework integration
https://github.com/stfc/psyclone
PSyclone is a source-to-source Fortran compiler designed to programmatically optimise, parallelise and instrument HPC applications via user-provided transformation scripts.
batchtools
batchtools: Tools for R to work on batch systems - Published in JOSS (2017)
pygmo
A Python platform to perform parallel computations of optimisation tasks (global and local) via the asynchronous generalized island model.
ppdyn
Plasma Particle Dynamics (PPDyn) is a Python code for simulating plasma particles using a molecular dynamics algorithm. It uses the Numba JIT compiler for faster performance. Detailed documentation can be found at https://ppdyn.readthedocs.io/.
sundials
Official development repository for SUNDIALS - a SUite of Nonlinear and DIfferential/ALgebraic equation Solvers. Pull requests are welcome for bug fixes and minor changes.
parafilt
Collection of parallel adaptive filter implementations for efficient signal processing applications in PyTorch.
pyhpc-benchmarks
A suite of benchmarks for CPU and GPU performance of the most popular high-performance libraries for Python :rocket:
rockyml
⛰️ RockyML - A High-Performance Scientific Computing Framework for Non-smooth Machine Learning Problems
cato
Automatic source transformation to apply HPC frameworks with minimal user interaction
kokkos
Kokkos C++ Performance Portability Programming Ecosystem: The Programming Model - Parallel Execution and Memory Abstraction
pestpp
Tools for scalable and non-intrusive parameter estimation, uncertainty analysis, and sensitivity analysis
adaptive
:chart_with_upwards_trend: Adaptive: parallel active learning of mathematical functions
p4pdes
C and Python examples from my book on using PETSc and Firedrake to solve PDEs
future
:rocket: R package: future: Unified Parallel and Distributed Processing in R for Everyone
discostic-sim
A cross-architecture resource-based parallel simulation framework that can efficiently predict the performance of real or hypothetical massively parallel MPI programs on current and future heterogeneous systems.
https://github.com/madsjulia/robustpmap.jl
Robust pmap calls for efficient parallelization and high-performance computing
https://github.com/futureverse/future.apply
:rocket: R package: future.apply - Apply Function to Elements in Parallel using Futures
pararealgpu.jl
A distributed and GPU-based implementation of the Parareal algorithm for parallel-in-time integration of equations of motion.
https://github.com/cran-task-views/highperformancecomputing
CRAN Task View: High-Performance and Parallel Computing with R
https://github.com/radiantone/blazer
An HPC abstraction over MPI with built-in parallel compute primitives
https://github.com/actinia-org/actinia-parallel-plugin
This is the actinia parallel plugin for faster processing (WIP).
doParabar
An `R` package that provides a `foreach` parallel adaptor for `parabar` backends.
optimagic
optimagic is a Python package for numerical optimization. It is a unified interface to optimizers from SciPy, NLopt and other packages. optimagic's minimize function works just like SciPy's, so you don't have to adjust your code. You simply get more optimizers for free. On top of that, you get diagnostic tools, parallel numerical derivatives and more.
pyqkd
Repository with codes for simulating and optimising quantum key distribution protocols.
https://github.com/bkraad47/fat_llama
fat_llama is a Python package for upscaling audio files to FLAC or WAV formats using advanced audio processing techniques. It utilizes CUDA-accelerated calculations to enhance audio quality by upsampling and adding missing frequencies through FFT, resulting in richer and more detailed audio.
libfork
A bleeding-edge, lock-free, wait-free, continuation-stealing tasking library built on C++20's coroutines
mcmlgpu
This repository contains the base code for GPU-based Monte Carlo simulations of light transport in turbid media.
https://github.com/bluebrain/bluepyparallel
Provides an embarrassingly parallel tool with an SQL backend
librom
Data-driven model reduction library with an emphasis on large scale parallelism and linear subspace methods
https://github.com/beehive-lab/tornadovm
TornadoVM: A practical and efficient heterogeneous programming framework for managed languages
chunklist
A Chunk List is a new, concurrent, chunk-based data structure that is easily modifiable and allows for fast run-time operations.
https://github.com/kahypar/mt-kahypar
Mt-KaHyPar (Multi-Threaded Karlsruhe Hypergraph Partitioner) is a shared-memory multilevel graph and hypergraph partitioner equipped with parallel implementations of techniques used in the best sequential partitioning algorithms. Mt-KaHyPar can partition extremely large hypergraphs very fast and with high quality.
MaterialPointSolver
🧮 High-performance Material Point Method (MPM) Solver in Julia.
https://github.com/chiang-yuan/culsm
CUDA C++ code implementing GPU-accelerated Lattice Spring Model (CuLSM) simulations.
tomobear
TomoBEAR is a configurable and customizable modular pipeline for streamlined processing of cryo-electron tomographic data for subtomogram averaging.
portageclusterutils
Cluster and parallelization utilities that came out of the Portage SMT project
https://github.com/bamresearch/analyse_dls_with_contin
This repository contains Python code and a Jupyter Notebook that run the original CONTIN program by S. Provencher on every DLS measurement (dynamic light scattering, also known as photon correlation spectroscopy, PCS) read from *.ASC files.
https://github.com/jianqoq/hpt
A high performance N-dimensional array library for Rust
propulate
Propulate is an asynchronous population-based optimization algorithm and software package for global optimization and hyperparameter search on high-performance computers.
enance-amamento
A template library for headless rendering of Signed Distance Fields based on OpenMP.
charm
The Charm++ parallel programming system. Visit https://charmplusplus.org/ for more information.
https://github.com/cmkobel/mspipeline1
🪆🦖 A Snakemake wrapper around Nesvilab's FragPipe-CLI. In a perfect world, this pipeline would be based on Sage.
https://github.com/aaronjs99/planmux
PlanMux: Path Planning using Parallel/Multiplexed Computing