Updated 6 months ago

ck • Rank 19.4 • Science 77%

Collective Knowledge (CK), Collective Mind (CM/CMX) and MLPerf automations: community-driven projects to facilitate collaborative and reproducible research and to learn how to run AI, ML, and other emerging workloads more efficiently and cost-effectively across diverse models, datasets, software, and hardware using MLPerf methodology and benchmarks

Updated 6 months ago

active-learning-as-a-service • Rank 6.8 • Science 77%

A scalable & efficient active learning/data selection system for everyone.

Updated 6 months ago

bentoml • Rank 26.1 • Science 54%

The easiest way to serve AI apps and models - Build Model Inference APIs, Job queues, LLM apps, Multi-model pipelines, and more!

Updated 6 months ago

tensorzero • Rank 23.4 • Science 54%

TensorZero is an open-source stack for industrial-grade LLM applications. It unifies an LLM gateway, observability, optimization, evaluation, and experimentation.

Updated 6 months ago

deepchecks • Rank 22.8 • Science 54%

Deepchecks: Tests for Continuous Validation of ML Models & Data. Deepchecks is a holistic open-source solution for all of your AI & ML validation needs, enabling you to thoroughly test your data and models from research to production.

Updated 6 months ago

mosec • Rank 20.4 • Science 54%

A high-performance ML model serving framework that offers dynamic batching and CPU/GPU pipelines to fully exploit your compute machine.

Updated 6 months ago

mlflow • Rank 35.0 • Science 36%

The open source developer platform to build AI/LLM applications and models with confidence. Enhance your AI applications with end-to-end tracking, observability, and evaluations, all in one integrated platform.

Updated 6 months ago

kedro • Rank 14.8 • Science 54%

Kedro is a toolbox for production-ready data science. It uses software engineering best practices to help you create data engineering and data science pipelines that are reproducible, maintainable, and modular.

Updated 6 months ago

weaviate • Rank 24.2 • Science 44%

Weaviate is an open-source vector database that stores both objects and vectors, combining vector search with structured filtering and the fault tolerance and scalability of a cloud-native database.

Updated 6 months ago

openllm • Rank 22.1 • Science 44%

Run any open-source LLM, such as DeepSeek or Llama, as an OpenAI-compatible API endpoint in the cloud.

Updated 6 months ago

https://github.com/lancedb/lance • Rank 28.1 • Science 36%

Modern columnar data format for ML and LLMs implemented in Rust. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, PyArrow, and PyTorch, with more integrations coming.

Updated 6 months ago

deeplake • Rank 25.6 • Science 36%

Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai

Updated 4 months ago

https://github.com/deepset-ai/haystack-core-integrations • Rank 22.9 • Science 36%

Additional packages (components, document stores, and the like) to extend the capabilities of Haystack.

Updated 6 months ago

state-of-open-source-ai • Rank 10.3 • Science 44%

📕 Clarity in the current fast-paced mess of Open Source innovation

Updated 6 months ago

https://github.com/apache/hamilton • Rank 12.1 • Science 36%

Apache Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.

Updated 6 months ago

monai-deploy • Rank 8.0 • Science 36%

MONAI Deploy aims to become the de-facto standard for developing, packaging, testing, deploying and running medical AI applications in clinical production.

Updated 6 months ago

vetiver • Rank 15.1 • Science 26%

Version, share, deploy, and monitor models.

Updated 6 months ago

https://github.com/thebabylonai/babylog • Rank 9.1 • Science 23%

A lightweight logger for machine learning teams to log images and predictions in production.

Updated 6 months ago

https://github.com/thenewflesh/hidebound • Rank 5.7 • Science 26%

Hidebound is a massive, distributed digital asset management system for ML pipelines on Kubernetes.

Updated 6 months ago

https://github.com/neptune-ai/neptune-notebooks • Rank 14.9 • Science 13%

📚 Jupyter Notebooks extension for versioning, managing and sharing notebook checkpoints in your machine learning and data science projects.

Updated 6 months ago

https://github.com/ploomber/soorgeon • Rank 14.4 • Science 13%

Convert monolithic Jupyter notebooks 📙 into maintainable Ploomber pipelines. 📊

Updated 6 months ago

https://github.com/whylabs/whylogs • Rank 13.4 • Science 13%

An open-source data logging library for machine learning models and data pipelines. 📚 Provides visibility into data quality & model performance over time. 🛡️ Supports privacy-preserving data collection, ensuring safety & robustness. 📈

Updated 5 months ago

https://github.com/bentoml/clip-api-service • Rank 9.2 • Science 13%

CLIP as a service - Embed image and sentences, object recognition, visual reasoning, image classification and reverse image search

Updated 6 months ago

https://github.com/agnostiqhq/covalent-cloud-github-workflow • Rank 3.2 • Science 13%

Template for integrating Covalent Cloud's high-performance computing capabilities into GitHub Workflows

Updated 6 months ago

trustpy-tools • Science 44%

TrustPy is a production-ready Python package purpose-built for MLOps pipelines—enabling automated, interpretable analysis of model trustworthiness and predictive reliability before deployment. Available via Conda-Forge and PyPI, with full CI/CD integration and seamless compatibility across modern ML stacks.

Updated 5 months ago

https://github.com/awslabs/aiops-modules • Science 26%

AIOps modules is a collection of reusable Infrastructure as Code (IaC) modules for Machine Learning (ML), Foundation Models (FM), Large Language Models (LLM) and GenAI development and operations on AWS

Updated 5 months ago

https://github.com/bentoml/plugins • Science 13%

The Swiss knife for all things BentoML.

Updated 5 months ago

https://github.com/amr-yasser226/customer-churn-prediction • Science 26%

End-to-end customer churn prediction project: dataset preparation, experiments with scikit-learn, model tracking with MLflow, data versioning (DVC), CI/CD, and deployment examples.

Updated 6 months ago

https://github.com/adalkiran/distributed-inference • Science 13%

A project to demonstrate an approach to designing a cross-language, distributed pipeline in the deep learning/machine learning domain, using WebRTC and Redis Streams.

Updated 6 months ago

deeptsf • Science 57%

The DeepTSF time series forecasting repository developed by EPU NTUA within the DeployAI project

Updated 6 months ago

csc-mlops • Science 44%

Framework for building ML apps

Updated 5 months ago

https://github.com/amilworks/argocd-demo • Science 13%

Guide to Getting Started with ArgoCD

Updated 5 months ago

https://github.com/bentoml/transformers-nlp-service • Science 13%

Online Inference API for NLP Transformer models - summarization, text classification, sentiment analysis and more

Updated 6 months ago

glide • Science 44%

🐦 An open, blazing-fast, simple model gateway for rapid development of production GenAI apps

Updated 6 months ago

agilerl • Science 54%

Streamlining reinforcement learning with RLOps. State-of-the-art RL algorithms and tools, with 10x faster training through evolutionary hyperparameter optimization.