imodels
imodels: a python package for fitting interpretable models - Published in JOSS (2021)
PyCM
PyCM: Multiclass confusion matrix library in Python - Published in JOSS (2018)
Machine Learning Validation via Rational Dataset Sampling with astartes
Machine Learning Validation via Rational Dataset Sampling with astartes - Published in JOSS (2023)
DSSE
DSSE: An environment for simulation of reinforcement learning-empowered drone swarm maritime search and rescue missions - Published in JOSS (2024)
VeridicalFlow
VeridicalFlow: a Python package for building trustworthy data science pipelines with PCS - Published in JOSS (2022)
LangFair
LangFair: A Python Package for Assessing Bias and Fairness in Large Language Model Use Cases - Published in JOSS (2025)
modelStudio
modelStudio: Interactive Studio with Explanations for ML Predictive Models - Published in JOSS (2019)
FuseMedML
FuseMedML: a framework for accelerated discovery in machine learning based biomedicine - Published in JOSS (2023)
BetaML
BetaML: The Beta Machine Learning Toolkit, a self-contained repository of Machine Learning algorithms in Julia - Published in JOSS (2021)
RT-utils
RT-utils: A Minimal Python Library for RT-struct Manipulation - Published in JOSS (2025)
lazyllm-llamafactory
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
LGP
LGP: A robust Linear Genetic Programming implementation on the JVM using Kotlin - Published in JOSS (2019)
datasets
🤗 The largest hub of ready-to-use datasets for AI models with fast, easy-to-use and efficient data manipulation tools
skpro
A unified framework for tabular probabilistic regression, time-to-event prediction, and probability distributions in Python
chatAI4R: Interactive Artificial Intelligence toolkit for Data Science in R
chatAI4R: Interactive Artificial Intelligence toolkit for Data Science in R - Published in JOSS (2026)
maidr-legacy
[DEPRECATED prototype] Multimodal Access and Interactive Data Representation
https://github.com/hpcaitech/colossalai
Making large AI models cheaper, faster and more accessible
pytorch-lightning
Pretrain, finetune ANY AI model of ANY size on multiple GPUs, TPUs with zero code changes.
litgpt
20+ high-performance LLMs with recipes to pretrain, finetune and deploy at scale.
txtai
💡 All-in-one open-source AI framework for semantic search, LLM orchestration and language model workflows
torchrl
A modular, primitive-first, python-first PyTorch library for Reinforcement Learning.
matsciml
Open MatSci ML Toolkit is a framework for prototyping and scaling out deep learning models for materials discovery supporting widely used materials science datasets, and built on top of PyTorch Lightning, the Deep Graph Library, and PyTorch Geometric.
https://github.com/recommenders-team/recommenders
Best Practices on Recommendation Systems
farm-haystack
AI orchestration framework to build customizable, production-ready LLM applications. Connect components (models, vector DBs, file converters) to pipelines or agents that can interact with your data. With advanced retrieval methods, it's best suited for building RAG, question answering, semantic search or conversational agent chatbots.
gpt-researcher
LLM based autonomous agent that conducts deep local and web research on any topic and generates a long report with citations.
pandas-ai
Chat with your database or your datalake (SQL, CSV, parquet). PandasAI makes data analysis conversational using LLMs and RAG.
SciMLBenchmarks
Scientific machine learning (SciML) benchmarks, AI for science, and (differential) equation solvers. Covers Julia, Python (PyTorch, JAX), MATLAB, and R.
fairlearn
A Python package to assess and improve fairness of machine learning models.
tensorzero
TensorZero is an open-source stack for industrial-grade LLM applications. It unifies an LLM gateway, observability, optimization, evaluation, and experimentation.
swarms
The Enterprise-Grade Production-Ready Multi-Agent Orchestration Framework. Website: https://swarms.ai
acttensor-tf
ActTensor: Activation Functions for TensorFlow. https://pypi.org/project/ActTensor-tf/ Authors: Pouya Ardehkhani, Pegah Ardehkhani
harmony
The Harmony Python library: a research tool for psychologists to harmonise data and questionnaire items. Open source.
grand-challenge.org
A platform for end-to-end development of machine learning solutions in biomedical imaging
https://github.com/sktime/pytorch-forecasting
Time series forecasting with PyTorch
mlflow
The open source developer platform to build AI/LLM applications and models with confidence. Enhance your AI applications with end-to-end tracking, observability, and evaluations, all in one integrated platform.
reductstore
High Performance Storage and Streaming Solution for Data Acquisition Systems
llms-from-scratch
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
aif360
A comprehensive set of fairness metrics for datasets and machine learning models, explanations for these metrics, and algorithms to mitigate bias in datasets and models.
thinc
🔮 A refreshing functional take on deep learning, compatible with your favorite libraries
https://github.com/facebookresearch/habitat-lab
A modular high-level library to train embodied AI agents across a variety of tasks and environments.
distilabel
Distilabel is a framework for synthetic data and AI feedback for engineers who need fast, reliable and scalable pipelines based on verified research papers.
beeai-framework
Build production-ready AI agents in both Python and TypeScript.
multilspy
multilspy is an LSP client library in Python intended to be used to build applications around language servers.
basic-memory
AI conversations that actually remember. Never re-explain your project to Claude again. Local-first, integrates with Obsidian. Join our Discord: https://discord.gg/tyvKNccgqN
https://github.com/allenai/allenact
An open source framework for research in Embodied-AI from AI2.
eco2ai
eco2AI is a Python library that accumulates statistics about power consumption and CO2 emissions while code is running.
linear-relational
Linear Relational Embeddings (LREs) and Linear Relational Concepts (LRCs) for LLMs in PyTorch
@llm-tools/embedjs
A NodeJS RAG framework to easily work with LLMs and embeddings
deeplake
Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai
monitors4codegen
Code and data artifact for the NeurIPS 2023 paper "Monitor-Guided Decoding of Code LMs with Static Analysis of Repository Context". `multilspy` is an LSP client library in Python intended to be used to build applications around language servers.
nerlnet
Nerlnet is a framework for research and development of distributed machine learning models on IoT
https://github.com/dair-ai/ml-papers-of-the-week
🔥Highlighting the top ML papers every week.
https://github.com/khoj-ai/khoj
Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. Turn any online or local LLM into your personal, autonomous AI (gpt, claude, gemini, llama, qwen, mistral). Get started - free.
https://github.com/carla-simulator/carla
Open-source simulator for autonomous driving research.
https://github.com/zenml-io/zenml
ZenML 🙏: MLOps for Reliable AI: from Classical AI to Agents. https://zenml.io.
https://github.com/floneum/floneum
Instant, controllable, local pre-trained AI models in Rust
iamai
A rule-driven comprehensive AI toolkit emphasizing simultaneous support for multimodal machine learning and the ability to construct cross-platform robots using logic.
docling4j
Docling4j brings the functionalities of Docling in document understanding to Java® projects