HPX - The C++ Standard Library for Parallelism and Concurrency
HPX - The C++ Standard Library for Parallelism and Concurrency - Published in JOSS (2020)
GridapDistributed
GridapDistributed: a massively parallel finite element toolbox in Julia - Published in JOSS (2022)
airflow-provider-vineyard
vineyard (v6d): an in-memory immutable data manager. (Project under CNCF, TAG-Storage)
heat
Distributed tensors and Machine Learning framework with GPU and MPI acceleration in Python
code2vec
TensorFlow code for the neural network presented in the paper: "code2vec: Learning Distributed Representations of Code"
ray
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
lightgbm
A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning tasks.
com.arcadedb:arcadedb-console
ArcadeDB Multi-Model Database, one DBMS that supports SQL, Cypher, Gremlin, HTTP/JSON, MongoDB and Redis. ArcadeDB is a conceptual fork of OrientDB, the first Multi-Model DBMS. ArcadeDB supports Vector Embeddings.
drasyl
drasyl-java is a high-performance framework for rapid development of distributed applications
h2o
H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K-Means, PCA, Generalized Additive Models (GAM), RuleFit, Support Vector Machine (SVM), Stacked Ensembles, Automatic Machine Learning (AutoML), etc.
https://github.com/bacalhau-project/bacalhau
Community-driven, simple, yet powerful framework for fast, cost-effective distributed Compute over Data.
fugue
A unified interface for distributed computing. Fugue executes SQL, Python, Pandas, and Polars code on Spark, Dask and Ray without any rewrites.
https://github.com/carv-ics-forth/tebis
An efficient distributed key value store for fast storage devices and RDMA networks
https://github.com/austinjhunt/petrinet-webgme-designstudio
A custom design studio for modeling and simulating distributed systems as Petri (place/transition) Nets, built with JointJS, WebGME, and NodeJS.
https://github.com/ayaanhossain/sharedb
An on-disk pythonic embedded key-value store for compressed data storage and distributed data analysis
https://github.com/ganweisoft/gateway
Gateway is a high-performance, centralized communication and scheduling module for various device plugins. It uniformly converts heterogeneous data into standardized models and delivers core functionalities such as real-time data storage, alarm triggering, linkage control, and task planning.
https://github.com/plynx-team/plynx
PLynx is a domain agnostic platform for managing reproducible experiments and data-oriented workflows.
https://github.com/lamastex/scadamale
Scalable Data Science and Distributed Machine Learning Course Book written by Raazesh Sainudiin and his WASP AI-Track PhD Students
https://github.com/baggepinnen/lazywavfiles.jl
Lazily treat wav (audio) files as arrays. Arrays can be distributed over many wav files.
https://github.com/adalkiran/distributed-inference
A project demonstrating an approach to designing a cross-language, distributed pipeline in the deep learning/machine learning domain, using WebRTC and Redis Streams.
turboprune
Harness for training and finding lottery tickets in PyTorch, with support for multiple pruning techniques, distributed training, FFCV and AMP.
agilerl
Streamlining reinforcement learning with RLOps. State-of-the-art RL algorithms and tools, with 10x faster training through evolutionary hyperparameter optimization.
https://github.com/dptech-corp/uni-core
An efficient distributed PyTorch framework
https://github.com/greptimeteam/greptimedb
Open-source, cloud-native, unified observability database for metrics, logs and traces, supporting SQL/PromQL/Streaming. Available on GreptimeCloud.
iris
Iris - P2P System for Confidential Sharing of Threat Intelligence and Collaborative Defense for Slips
https://github.com/austinjhunt/vanderbiltcs6381-assignment1-zmqpubsub
This project offers a framework for spinning up a publish/subscribe system either on a single host or on a virtualized network with a tool like Mininet.
propulate
Propulate is an asynchronous population-based optimization algorithm and software package for global optimization and hyperparameter search on high-performance computers.