libCEED
libCEED: Fast algebra for high-order element-based discretizations - Published in JOSS (2021)
mpi4jax
mpi4jax: Zero-copy MPI communication of JAX arrays - Published in JOSS (2021)
torchquad
torchquad: Numerical Integration in Arbitrary Dimensions with PyTorch - Published in JOSS (2021)
Monte Carlo / Dynamic Code (MC/DC)
Monte Carlo / Dynamic Code (MC/DC): An accelerated Python package for fully transient neutron transport and rapid methods development - Published in JOSS (2024)
AMReX
AMReX: a framework for block-structured adaptive mesh refinement - Published in JOSS (2019)
Cabana
Cabana: A Performance Portable Library for Particle-Based Simulations - Published in JOSS (2022)
t8code - modular adaptive mesh refinement in the exascale era
t8code - modular adaptive mesh refinement in the exascale era - Published in JOSS (2025)
madupite
madupite: A High-Performance Distributed Solver for Large-Scale Markov Decision Processes - Published in JOSS (2025)
BracketingNonlinearSolve
High-performance and differentiation-enabled nonlinear solvers (Newton methods), bracketed rootfinding (bisection, Falsi), with sparsity and Newton-Krylov support.
parpe
Parameter estimation for dynamical models using high-performance computing, batch and mini-batch optimizers, and dynamic load balancing.
t-elf
Tensor Extraction of Latent Features (T-ELF). Within T-ELF's arsenal are non-negative matrix and tensor factorization solutions, equipped with automatic model determination (also known as the estimation of latent factors - rank) for accurate data modeling. Our software suite encompasses cutting-edge data pre-processing and post-processing modules.
jurassic
The Juelich Rapid Spectral Simulation Code (JURASSIC) is a fast infrared radiative transfer model for the analysis of atmospheric remote sensing measurements.
gps
This repository provides a collection of codes for the analysis of GPS/RO observations.
iasi
This repository provides a collection of codes for the analysis of remote sensing observations of EUMETSAT's Infrared Atmospheric Sounding Interferometer (IASI).
fluidx3d
The fastest and most memory efficient lattice Boltzmann CFD software, running on all GPUs and CPUs via OpenCL. Free for non-commercial use.
calibr
Parallelized Bayesian calibration of simulations using Gaussian process emulation
biomero
BIOMERO - A Python library for easily connecting OMERO (jobs) to a Slurm cluster
sundials
Official development repository for SUNDIALS - a SUite of Nonlinear and DIfferential/ALgebraic equation Solvers. Pull requests are welcome for bug fixes and minor changes.
pyhpc-benchmarks
A suite of benchmarks for CPU and GPU performance of the most popular high-performance libraries for Python
PoissonRandom
Fast Poisson Random Numbers in pure Julia for scientific machine learning (SciML)
kokkos
Kokkos C++ Performance Portability Programming Ecosystem: The Programming Model - Parallel Execution and Memory Abstraction
PreallocationTools
Tools for building non-allocating pre-cached functions in Julia, allowing for GC-free usage of automatic differentiation in complex codes
ntpoly
A massively parallel library for computing the functions of sparse matrices.
librapid
A highly optimised C++ library for mathematical applications and neural networks.
opencl-benchmark
A small OpenCL benchmark program to measure peak GPU/CPU performance.
pararealgpu.jl
A distributed and GPU-based implementation of the Parareal algorithm for parallel-in-time integration of equations of motion.
entropy-core-problem
Repository for supplementary material from my publications on the entropy core problem
dplasma
DPLASMA is a highly optimized, accelerator-aware implementation of a dense linear algebra package for distributed heterogeneous systems. It is designed to deliver sustained performance on distributed systems where each node features multiple sockets of multicore processors and, if available, accelerators, using the PaRSEC runtime as a backend.
howtospinfoamamplitude
Notebooks and data with a practical introduction to numerical computations of spinfoam amplitudes in LQG.
flexi
FLEXI: A high order discontinuous Galerkin framework for hyperbolic–parabolic conservation laws
software-engineer
A curated learning repository focused on High-Performance Computing (HPC) — covering fundamentals to advanced topics in CUDA, MPI, C++, and Python-C++ interoperability.
parsec
PaRSEC is a generic framework for architecture-aware scheduling and management of micro-tasks on distributed, GPU-accelerated, many-core heterogeneous architectures. PaRSEC assigns computation threads to the cores and GPU accelerators, overlaps communications and computations, and uses a dynamic, fully distributed scheduler based on architectural features such as NUMA nodes and algorithmic features such as data reuse.
rotational-ksz-macsis
Repository for supplementary material from my publication on the rotational kinetic SZ effect in MACSIS
hpc_submit_scripts
Slurm job scripts to run and analyze molecular dynamics simulations on high performance computers
delicoco-ieee-transactions
In compressed decentralized optimization settings, there are benefits to having multiple gossip steps between subsequent gradient iterations, even when the cost of doing so is appropriately accounted for, e.g., by reducing the precision of the compressed information.
masters-thesis
Monitoring parallel file system usage in a high-performance computer cluster
thread-pool
BS::thread_pool: a fast, lightweight, modern, and easy-to-use C++17 / C++20 / C++23 thread pool library
spinfoam_radiative_corrections
Codes for the computation of the radiative corrections to the EPRL spin foam propagator in covariant LQG.
fans
FANS: an open-source, efficient, and parallel FFT-based homogenization solver designed to solve microscale multiphysics problems.
blacktowhitehole
Codes and notebooks for the computation of the Black-to-White hole transition amplitude and the crossing time in covariant LQG.
parallel-etudes
Parallel programming assignments using OpenMP, MPI and CUDA/OpenCL