Projects with CITATION.cff | Open Source Science

Scientific Software

Updated 11 months ago

libCEED — Peer-reviewed • Rank 18.8 • Science 100%

libCEED: Fast algebra for high-order element-based discretizations - Published in JOSS (2021)

api ceed cuda ecp exascale-computing gpu high-order high-performance-computing hpc julia linear-algebra

Economics (40%)

Scientific Software · Peer-reviewed

Updated 11 months ago

mpi4jax — Peer-reviewed • Rank 14.6 • Science 100%

mpi4jax: Zero-copy MPI communication of JAX arrays - Published in JOSS (2021)

gpu high-performance-computing jax jit mpi parallel-computing xla

Scientific Software · Peer-reviewed

Updated 11 months ago

torchquad — Peer-reviewed • Rank 14.3 • Science 100%

torchquad: Numerical Integration in Arbitrary Dimensions with PyTorch - Published in JOSS (2021)

automatic-differentiation gpu high-performance-computing integration machine-learning monte-carlo-integration multidimensional-integration numerical-integration python pytorch torchquad vegas vegas-enhanced

Scientific Software · Peer-reviewed

Updated 11 months ago

GeophysicalFlows.jl — Peer-reviewed • Rank 9.8 • Science 100%

GeophysicalFlows.jl: Solvers for geophysical fluid dynamics problems in periodic domains on CPUs & GPUs - Published in JOSS (2021)

baroclinic fourierflows geophysical-fluid-dynamics gpu julia navier-stokes pdes qg quasigeostrophy sqg

Scientific Software · Peer-reviewed

Updated 11 months ago

Oceananigans.jl — Peer-reviewed • Rank 17.4 • Science 85%

Oceananigans.jl: Fast and friendly geophysical fluid dynamics on GPUs - Published in JOSS (2020)

climate climate-change data-assimilation fluid-dynamics gpu julia machine-learning ocean

Scientific Software · Peer-reviewed

Updated 11 months ago

Makie.jl — Peer-reviewed • Rank 23.8 • Science 77%

Makie.jl: Flexible high-performance data visualization for Julia - Published in JOSS (2021)

gpu graphics julia julia-language plotting visualization

Scientific Software · Peer-reviewed

Updated 11 months ago

veros • Rank 14.5 • Science 85%

The versatile ocean simulator, in pure Python, powered by JAX.

climate distributed geophysics gpu jax multi-core oceanography parallel python

Updated 11 months ago

CUDA • Rank 21.1 • Science 77%

CUDA programming in Julia.

cuda gpu hacktoberfest julia

Updated 11 months ago

umpire • Rank 12.9 • Science 85%

An application-focused API for memory management on NUMA & GPU architectures

blt cpp gpu hpc memory-management portability radiuss

Updated 11 months ago

heat • Rank 16.8 • Science 77%

Distributed tensors and Machine Learning framework with GPU and MPI acceleration in Python

array-api data-analytics data-processing data-science distributed gpu hpc machine-learning massive-datasets mpi mpi4py multi-gpu multi-node-cluster numpy parallelism python pytorch tensors

Updated 11 months ago

norse • Rank 16.4 • Science 77%

Deep learning with spiking neural networks (SNNs) in PyTorch.

autograd deep-learning gpu machine-learning neural-network neuromorphic pytorch pytorch-lightning spiking-neural-networks tensor

Updated 11 months ago

kernel-tuner • Rank 14.6 • Science 77%

Kernel Tuner

auto-tuning autotuning c cplusplus cuda cuda-kernels gpu gpu-computing kernel-tuner machine-learning opencl opencl-kernels optimization python software-development testing

Updated 11 months ago

gpullama3.java • Rank 6.6 • Science 85%

GPU-accelerated Llama3.java inference in pure Java using TornadoVM.

accelerators compilers deepseek-r1 gguf gpu java java21 llama3 llm mistral mistral-7b nvidia phi-3 phi-3-mini qwen2-5 qwen3 tornadovm

Updated 11 months ago

t-elf • Rank 5.5 • Science 85%

Tensor Extraction of Latent Features (T-ELF). Within T-ELF's arsenal are non-negative matrix and tensor factorization solutions, equipped with automatic model determination (also known as the estimation of latent factors - rank) for accurate data modeling. Our software suite encompasses cutting-edge data pre-processing and post-processing modules.

blind-source-separation dimensionality-reduction feature-extraction gpu high-performance-computing hpc latent-variables machine-learning matrix matrix-completion matrix-factorization non-negative-matrix-factorization pattern-extraction semi-supervised-learning tensor-decomposition tensor-factorization tensors text-preprocessing unsupervised-learning

Updated 11 months ago

scalene • Rank 26.0 • Science 64%

Scalene: a high-performance, high-precision CPU, GPU, and memory profiler for Python with AI-powered optimization proposals

cpu cpu-profiling gpu gpu-programming memory-allocation memory-consumption performance-analysis performance-cpu profiler profiles-memory profiling python python-profilers scalene

Updated 11 months ago

tensorcircuit-ng • Rank 12.6 • Science 77%

The next-gen tensor network based quantum software framework: superseding the original TensorCircuit

automatic-differentiation distributed-training gpu jax machine-learning neural-network nisq open-quantum-systems pytorch quantum-algorithm quantum-circuit quantum-computing quantum-dynamics quantum-hardware quantum-machine-learning quantum-noise quantum-simulation tensor-network tensorflow

Updated 11 months ago

brian2cuda • Rank 11.8 • Science 77%

A brian2 extension to simulate spiking neural networks on GPUs

biological-simulations brian brian2 code-generation computational-neuroscience differential-equations gpu gpu-acceleration neuroscience python science simulation simulation-framework spiking-neural-networks

Updated 11 months ago

terragpu • Rank 3.3 • Science 85%

Python library to process and classify remote sensing imagery by means of GPUs and ML.

ai cudf cupy cuspatial dask earth-science geopandas gpu ml numpy raster vector

Updated 11 months ago

pycuda • Rank 23.1 • Science 64%

CUDA integration for Python, plus shiny features

array cuda gpu gpu-computing multidimensional-arrays pycuda python scientific-computing

Updated 11 months ago

gunrock • Rank 10.0 • Science 77%

Programmable CUDA/C++ GPU Graph Analytics

algorithm algorithms cpp cuda cxx essentials gnn gpu graph graph-algorithms graph-analytics graph-engine graph-neural-networks graph-primitives graph-processing gunrock hpc parallel-computing sparse-matrix

Updated 11 months ago

pyopencl • Rank 23.0 • Science 64%

OpenCL integration for Python, plus shiny features

amd array cuda gpu heterogeneous-parallel-programming multidimensional-arrays nvidia opencl opengl parallel-algorithm parallel-computing performance prefix-sum pyopencl python reduction scientific-computing shared-memory sorting

Updated 11 months ago

pennylane-lightning • Rank 21.8 • Science 64%

The Lightning plugin ecosystem provides fast quantum state-vector and tensor network simulators written in C++ for use with PennyLane.

cuda distributed-computing gpu hpc mpi openmp parallel quantum-computing quantum-machine-learning rocm

Updated 11 months ago

flamegpu2 • Rank 8.2 • Science 77%

FLAME GPU 2 is a GPU-accelerated, agent-based modelling framework for CUDA C++ and Python

agent-based-modelling agent-based-simulation c-plus-plus cmake complex-systems cuda flamegpu flamegpu2 gpu modelling-agents simulation spatial-models

Updated 11 months ago

cupy • Rank 30.6 • Science 54%

NumPy & SciPy for GPU

cublas cuda cudnn cupy curand cusolver cusparse cusparselt cutensor gpu nccl numpy nvrtc nvtx python rocm scipy tensor

Updated 11 months ago

habitat • Rank 5.5 • Science 77%

🔮 Execution time predictions for deep neural network training iterations across different GPUs.

deep-learning deep-neural-networks gpu neural-networks performance performance-prediction

Updated 11 months ago

aluminum • Rank 8.4 • Science 72%

High-performance, GPU-aware communication library

cpp cuda gpu hpc mpi

Updated 11 months ago

tutorial-multi-gpu • Rank 8.1 • Science 72%

Efficient Distributed GPU Programming for Exascale, an SC/ISC Tutorial

cuda exascale-computing gpu hpc isc22 isc23 isc24 mpi multi-gpu nccl nvshmem sc21 sc22 sc23 supercomputing

Updated 11 months ago

Metal • Rank 15.3 • Science 64%

Metal programming in Julia

apple-gpu apple-silicon gpu hacktoberfest julia mac metal-framework

Updated 11 months ago

cuvec • Rank 11.7 • Science 67%

Unifying Python/C++/CUDA memory: Python buffered array ↔️ `std::vector` ↔️ CUDA managed memory

array buffer c cpp cpu cpython cpython-api cpython-extensions cuda cxx gpu hacktoberfest pybind11 python swig vector

Updated 11 months ago

cutlass • Rank 24.1 • Science 54%

CUDA Templates for Linear Algebra Subroutines

cpp cuda deep-learning deep-learning-library gpu nvidia

Updated 11 months ago

babelstream • Rank 10.7 • Science 67%

STREAM, for lots of devices written in many programming models

benchmark cuda gpgpu gpu hpc kokkos memory-bandwidth openacc opencl openmp parallel-processing raja sycl

Updated 11 months ago

oneAPI • Rank 12.9 • Science 64%

Julia support for the oneAPI programming toolkit.

gpu hacktoberfest julia oneapi

Updated 11 months ago

tiny-cuda-nn • Rank 11.8 • Science 64%

Lightning fast C++/CUDA neural network framework

cuda deep-learning gpu mlp nerf neural-network pytorch real-time rendering

Updated 11 months ago

fluidx3d • Rank 8.4 • Science 67%

The fastest and most memory efficient lattice Boltzmann CFD software, running on all GPUs and CPUs via OpenCL. Free for non-commercial use.

benchmark cfd computational-fluid-dynamics fluid-dynamics fluid-simulation fluid-solver gpgpu gpu gpu-computing high-performance-computing hpc interactive-visualization lattice-boltzmann lbm opencl physics raytracing scientific-computing scientific-visualization simulation

Updated 11 months ago

mlora-cli • Rank 11.3 • Science 64%

An Efficient "Factory" to Build Multiple LoRA Adapters

baichuan chatglm dpo finetune gpu llama llama2 llm lora mlora peft rlhf

Updated 11 months ago

gemmkernels.jl • Rank 7.8 • Science 67%

Flexible and performant GEMM kernels in Julia

cuda gpu julia

Updated 11 months ago

mosec • Rank 20.4 • Science 54%

A high-performance ML model serving framework that offers dynamic batching and CPU/GPU pipelines to fully exploit your compute machine

cv deep-learning gpu hacktoberfest jax llm llm-serving machine-learning machine-learning-platform mlops model-serving mxnet nerual-network python pytorch rust tensorflow tts

Updated 11 months ago

devito • Rank 19.3 • Science 54%

DSL and compiler framework for automated finite-differences and stencil computation

code-generation compiler dsl finite-difference fwi gpu hpc jit performance rtm stencil sympy ultrasound-imaging

Updated 11 months ago

RadonKA • Rank 5.2 • Science 67%

A simple yet sufficiently fast (attenuated) Radon and backproject implementation using KernelAbstractions.jl. Runs on CPU, CUDA, ...

automatic-differentiation computed-tomography ct cuda gpu julia julia-language optimization radon radon-transform tomography x-ray

Updated 11 months ago

BoundaryValueDiffEq • Rank 17.7 • Science 54%

Boundary value problem (BVP) solvers for scientific machine learning (SciML)

bvp differential-equations differentialequations gpu neural-bvp neural-differential-equations neural-ode scientific-machine-learning sciml

Updated 11 months ago

arbor • Rank 17.2 • Science 54%

The Arbor multi-compartment neural network simulation library.

cuda gpu hip hpc modern-cpp mpi neuroscience

Updated 11 months ago

cccl • Rank 17.2 • Science 54%

CUDA Core Compute Libraries

accelerated-computing cpp cpp-programming cuda cuda-cpp cuda-kernels cuda-library cuda-programming gpu gpu-acceleration gpu-computing gpu-programming hpc modern-cpp nvidia nvidia-gpu parallel-algorithm parallel-computing parallel-programming

Updated 11 months ago

pyhpc-benchmarks • Rank 6.9 • Science 64%

A suite of benchmarks for CPU and GPU performance of the most popular high-performance libraries for Python 🚀

benchmarks cupy gpu high-performance-computing jax parallel-computing python pytorch tensorflow

Updated 11 months ago

nx • Rank 26.9 • Science 44%

Multi-dimensional arrays (tensors) and numerical definitions for Elixir

elixir gpu jit numerical pytorch tensor xla

Updated 11 months ago

k-wave-python • Rank 15.6 • Science 54%

A Python interface to k-Wave GPU accelerated binaries

acoustics gpu kwave neuroscience python simulation ultrasound wave-equation

Updated 11 months ago

pytorch-benchmark • Rank 11.9 • Science 57%

Easily benchmark PyTorch model FLOPs, latency, throughput, allocated GPU memory, and energy consumption

benchmark deep-learning flops gpu jetson python pytorch timing-analysis

Updated 11 months ago

lc0 • Rank 12.7 • Science 54%

Open source neural network chess engine with GPU acceleration and broad hardware support.

alphazero alphazero-inspired chess chess-ai chess-engine cuda deep-learning deep-reinforcement-learning gpu leela-chess-zero neural-networks uci

Updated 11 months ago

DeconvOptim • Rank 7.5 • Science 57%

A multi-dimensional, high performance deconvolution framework written in Julia Lang for CPUs and GPUs.

deconvolution gpu image-processing julia microscopy

Updated 11 months ago

pytriton • Rank 20.0 • Science 44%

PyTriton is a Flask/FastAPI-like interface that simplifies Triton's deployment in Python environments.

deep-learning gpu inference

Updated 11 months ago

arborx • Rank 9.8 • Science 54%

Performance-portable geometric search library

bounding-volume-hierarchy c-plus-plus clustering cpp cuda dbscan distributed gpu hdbscan high-performance-computing hpc knn-search kokkos mpi nearest-neighbors parallel

Updated 11 months ago

exponentialutilities.jl • Rank 8.4 • Science 54%

Fast and differentiable implementations of matrix exponentials, Krylov exponential matrix-vector multiplications ("expmv"), KIOPS, ExpoKit functions, and more. All your exponential needs in SciML form.

differential-equations expmv expokit exponential gpu high-performance julia kiops krylov krylov-methods krylov-subspace-methods matrix-exponential matrix-exponentials scientific-machine-learning sciml

Updated 11 months ago

nvgraph.sh • Rank 5.1 • Science 54%

CLI for nvGraph, which is a GPU-based graph analytics library written by NVIDIA, using CUDA.

analytics cli console cuda gpu graph nvgraph nvidia pagerank terminal

Updated 11 months ago

triton-model-navigator • Rank 14.9 • Science 44%

Triton Model Navigator is an inference toolkit designed for optimizing and deploying Deep Learning models with a focus on NVIDIA GPUs.

deep-learning gpu inference

Updated 11 months ago

lpa-xrd • Rank 4.9 • Science 54%

Line profile analysis X-ray diffraction simulator.

automation diffraction gpu opencl python x-ray

Updated 11 months ago

beaver • Rank 14.7 • Science 44%

MLIR Toolkit in Elixir and Zig.

compiler elixir gpu mlir zig

Updated 11 months ago

librapid • Rank 14.6 • Science 44%

A highly optimised C++ library for mathematical applications and neural networks.

array cpp cpp20 cpp23 cuda gpu high-performance-computing library matrix multidimensional-arrays multithreading parallel-programming pypy pypy3 python python3 simd

Updated 11 months ago

matx • Rank 10.7 • Science 44%

An efficient C++17 GPU numerical computing library with Python-like syntax

cuda gpgpu gpu gpu-computing hpc

Updated 11 months ago

opencl-benchmark • Rank 5.5 • Science 44%

A small OpenCL benchmark program to measure peak GPU/CPU performance.

bandwidth benchmark benchmarking flops gpgpu gpu gpu-computing high-performance-computing hpc opencl tool tools

Updated 11 months ago

gpu-speedups • Rank 5.8 • Science 41%

GPU Speedups in Python

carpentries-incubator english gpu lesson pre-alpha python

Updated 11 months ago

text2vec-service • Rank 2.3 • Science 44%

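Service for converting text to vectors with a BERT model. An efficient text-to-vector service that supports multiple GPUs, multiple workers, and calls from multiple clients, and works out of the box.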

gpu nlp pytorch service

Updated 11 months ago

phoebe • Science 57%

A high-performance framework for solving phonon and electron Boltzmann equations

electrical-conductivity electron-phonon gpu kokkos materials-science physics thermal-conductivity thermoelectric

Updated 11 months ago

cudawrappers • Science 54%

C++ wrapper for the Nvidia/HIP C libraries (e.g. CUDA driver, nvrtc, hiprtc, cuFFT, hipFFT, etc.)

accelerator amd cpp cuda driver-api gpu gpu-computing hip library nvidia rocm

Updated 11 months ago

qrack • Science 67%

Comprehensive, GPU accelerated framework for developing universal virtual quantum processors

cuda distributed-quantum-computing gpu hpc integrated-graphics intel-hd-graphics near-clifford opencl physics physics-simulation quantum quantum-computer-simulator quantum-computing quantum-information quantum-simulator qubits

Updated 11 months ago

ml_with_aws_sagemaker • Science 44%

Learn how to scale up ML/AI pipelines using AWS SageMaker (GPUs, Cloud computing)

ai alpha artificial-intelligence aws carpentries-incubator cloud-computing english gpu lesson machine-learning ml neural-network open-source python pytorch sagemaker

Updated 11 months ago

llrs • Science 26%

The Low Latency Reconfiguration System

atom-reconfiguration-problems cpp cuda experimental-physics gpu neutral-atoms physics quantum-simulation tweezer-arrays

Updated 11 months ago

vector-sum-cuda • Science 54%

Comparing performance of sequential vs CUDA-based vector element sum.

cuda element experiment gpu sum vector

Updated 11 months ago

vkcompviz • Science 44%

Vulkan image and data processing framework capable of running a cascade of compute shaders and displaying or storing the result.

compute-shader gpgpu gpu render shader vulkan vulkan-api

Updated 11 months ago

hybridbackend • Science 54%

A high-performance framework for training wide-and-deep recommender systems on heterogeneous clusters

deep-learning gpu hybrid-parallelism parquet recommender-system

Updated 11 months ago

Lux • Science 65%

Elegant and Performant Deep Learning

deep-learning gpu machine-learning neural-networks scientific-machine-learning tpu xla

Updated 11 months ago

astro-accelerate • Science 67%

AstroAccelerate is a many-core accelerated software package for processing time-domain radio-astronomy data.

cuda gpu radio-astronomy

Updated 11 months ago

bundoora • Science 44%

Customized development container environment for consistent and efficient execution of machine learning projects.

ai cloud-native container deep-learning gpu pytorch transformer

Updated 11 months ago

interopunitycuda • Science 67%

Demonstrate interoperability between Unity Engine and CUDA

cpp cuda dx11 gpu gpu-acceleration native-plugin opengl unity unity3d

Updated 11 months ago

transformers-bart-pretrain • Science 54%

Script to pre-train Hugging Face transformers BART with TensorFlow 2

bart gpu huggingface-transformers pretraining tensorflow tpu

Updated 11 months ago

wakis • Science 67%

3D electromagnetic time-domain solver, specialized in wake potential and beam-coupling impedance computation for particle accelerators

3d accelerator-physics electromagnetic-simulation gpu impedance time-domain wakefield

Updated 11 months ago

server • Science 54%

The Triton Inference Server provides an optimized cloud and edge inferencing solution.

cloud datacenter deep-learning edge gpu inference machine-learning

Updated 11 months ago

datoviz • Science 54%

⚡ Datoviz: high-performance GPU rendering for scientific data visualization

c cpp data-visualization data-viz gpu graphics python rendering scientific-computing scientific-visualization visualization vulkan

Updated 11 months ago

Argos • Science 54%

Reduced-space optimization, for optimal power flow.

gpu julia optimization

Updated 11 months ago

intelliperf • Science 67%

Automated bottleneck detection and solution orchestration

amd genai gpu hip instinct llm performance rocm

Updated 11 months ago

futhark • Science 72%

💥💻💥 A data-parallel functional programming language

boom compiler cuda futhark gpgpu gpu hacktoberfest hpc language opencl

Artificial Intelligence and Machine Learning

Updated 11 months ago

sycl-bench • Science 57%

SYCL Benchmark Suite

gpgpu gpu gpu-programming opencl spir-v sycl

Updated 11 months ago

MHDFlows • Science 67%

Three-dimensional magnetohydrodynamic (MHD) pseudospectral solvers written in Julia with FourierFlows.jl

fourierflows gpu julia magnetohydrodynamics pdes turbulence

Updated 11 months ago

superterrainplus • Science 44%

SuperTerrain+: A real-time procedural 3D infinite terrain engine with geographical features and photorealistic rendering.

3d 3d-graphics computer-graphics cpp cuda glsl gpu graphics opengl physics-simulation procedural-generation real-time rendering school-project terrain-generation