imodels
imodels: a python package for fitting interpretable models - Published in JOSS (2021)
PyCM
PyCM: Multiclass confusion matrix library in Python - Published in JOSS (2018)
Machine Learning Validation via Rational Dataset Sampling with astartes
Machine Learning Validation via Rational Dataset Sampling with astartes - Published in JOSS (2023)
VeridicalFlow
VeridicalFlow: a Python package for building trustworthy data science pipelines with PCS - Published in JOSS (2022)
Choice-Learn
Choice-Learn: Large-scale choice modeling for operational contexts through the lens of machine learning - Published in JOSS (2024)
BetaML
BetaML: The Beta Machine Learning Toolkit, a self-contained repository of Machine Learning algorithms in Julia - Published in JOSS (2021)
irl-imitation
Implementation of Inverse Reinforcement Learning (IRL) algorithms in Python/TensorFlow: Deep MaxEnt, MaxEnt, and LPIRL.
tensorzero
TensorZero is an open-source stack for industrial-grade LLM applications. It unifies an LLM gateway, observability, optimization, evaluation, and experimentation.
deepchecks
Deepchecks: Tests for Continuous Validation of ML Models & Data. Deepchecks is a holistic open-source solution for all of your AI & ML validation needs, enabling you to thoroughly test your data and models from research to production.
mlflow
The open source developer platform to build AI/LLM applications and models with confidence. Enhance your AI applications with end-to-end tracking, observability, and evaluations, all in one integrated platform.
yggdrasil-decision-forests
A library to train, evaluate, interpret, and productionize decision forest models such as Random Forest and Gradient Boosted Decision Trees.
polyaxon
MLOps Tools For Managing & Orchestrating The Machine Learning LifeCycle
@stdlib/ml-incr-binary-classification
Incrementally perform binary classification using stochastic gradient descent (SGD).
https://github.com/biomedsciai/causallib
A Python package for modular causal inference analysis and model evaluations
deeplake
Database for AI. Store Vectors, Images, Texts, Videos, etc. Use with LLMs/LangChain. Store, query, version, & visualize any AI data. Stream data in real-time to PyTorch/TensorFlow. https://activeloop.ai
k-means-constrained
K-Means clustering - constrained with minimum and maximum cluster size. Documentation: https://joshlk.github.io/k-means-constrained
nerlnet
Nerlnet is a framework for research and development of distributed machine learning models on IoT
https://github.com/zenml-io/zenml
ZenML 🙏: MLOps for Reliable AI: from Classical AI to Agents. https://zenml.io.
structure-seer
The implementation, training and evaluation of a Structure Seer machine learning model designed for reconstruction of adjacency of a molecular graph from the labelling of its nodes.
iamai
A rule-driven comprehensive AI toolkit emphasizing simultaneous support for multimodal machine learning and the ability to construct cross-platform robots using logic.
state-of-open-source-ai
📕 Clarity in the current fast-paced mess of Open Source innovation
https://github.com/csinva/csinva.github.io
Slides, paper notes, class notes, blog posts, and research on ML 📉, statistics 📊, and AI 🤖.
https://github.com/featureform/featureform
The Virtual Feature Store. Turn your existing data infrastructure into a feature store.
https://github.com/google/dopamine
Dopamine is a research framework for fast prototyping of reinforcement learning algorithms.
@stdlib/ml-incr-sgd-regression
Online regression via stochastic gradient descent (SGD).
@stdlib/datasets-suthaharan-single-hop-sensor-network
Labeled wireless sensor network data set collected from a simple single-hop wireless sensor network deployment using TelosB motes.
@stdlib/datasets-suthaharan-multi-hop-sensor-network
Labeled wireless sensor network data set collected from a multi-hop wireless sensor network deployment using TelosB motes.
hierarchical-dnn-interpretations
Using / reproducing ACD from the paper "Hierarchical interpretations for neural network predictions" 🧠 (ICLR 2019)
netron
Visualizer for neural network, deep learning and machine learning models
https://github.com/superduper-io/superduper
Superduper: End-to-end framework for building custom AI applications and agents.
https://github.com/sematic-ai/sematic
An open-source ML pipeline development platform
rgf
Home repository for the Regularized Greedy Forest (RGF) library. It includes the original implementation from the paper and a multithreaded one written in C++, along with various language-specific wrappers.
https://github.com/csinva/gan-vae-pretrained-pytorch
Pretrained GANs + VAEs + classifiers for MNIST/CIFAR in PyTorch.
https://github.com/csinva/matching-with-gans
Matching in GAN latent space for better bias benchmarking and semantic image editing. 👶🏻🧒🏾👩🏼🦰👱🏽♂️👴🏾
https://github.com/csinva/disentangled-attribution-curves
Using / reproducing DAC from the paper "Disentangled Attribution Curves for Interpreting Random Forests and Boosted Trees"
https://github.com/csinva/analyzing-patient-perspectives
Analyzing interview data from the PediDOSE EFIC interviews using LLMs.
quickai
QuickAI is a Python library that makes it extremely easy to experiment with state-of-the-art Machine Learning models.
https://github.com/kyegomez/gemini
The open source implementation of Gemini, the model that will "eclipse ChatGPT" by Google
https://github.com/thebabylonai/babylog
A lightweight logger for machine learning teams to log images and predictions in production.
NYISOToolkit
Access data, statistics, and visualizations for New York's electricity grid.
https://github.com/drsoliddevil/mlr-gd
Multiple linear regression by gradient descent.
https://github.com/neptune-ai/neptune-notebooks
📚 Jupyter Notebooks extension for versioning, managing and sharing notebook checkpoints in your machine learning and data science projects.
https://github.com/rindow/rindow-neuralnetworks
Neural networks library for machine learning on PHP
https://github.com/csinva/gpt-paper-title-generator
Generating paper titles (and more!) with GPT trained on data scraped from arXiv.
https://github.com/csinva/dnn-ensemble
Testing the properties of ensembled neural networks.
https://github.com/raptor-ml/raptor
Transform your pythonic research to an artifact that engineers can deploy easily.
https://github.com/dair-ai/ml-nlp-paper-discussions
📄 A repo containing notes and discussions for our weekly NLP/ML paper discussions.
osdg-tool
OSDG is an open-source tool that maps and connects activities to the UN Sustainable Development Goals (SDGs) by identifying SDG-relevant content in any text. The tool is available online at www.osdg.ai. API access available for research purposes.
https://github.com/csbiology/chlamyatlas
Chlamy Atlas is an AI-powered web application that predicts the localization of proteins from the green alga Chlamydomonas reinhardtii.
https://github.com/agnostiqhq/tutorials_covalent_mlops_2022
Covalent tutorial for MLOps 2022
https://github.com/ccomkhj/lightening_classifier
PyTorch Lightning wrapper to make training classifiers easier.
motor_fault_detection_brb
Motor fault detection with ML using SVM, GBM, and ANN.
https://github.com/scimorph/secureml
Easy-to-use utilities to build privacy-preserving AI.
briscolabot
Reinforcement Learning agent that plays Briscola, a famous Italian card game
openspeaks-before-ai
A set of frameworks for creating the AI/ML building blocks for low-resource languages.
machine-learning-neural-python
Introduction to artificial neural nets with Python
intro-image-classification-cnn
A new lesson on image classification with convolutional neural networks.
https://github.com/adalkiran/distributed-inference
A project demonstrating an approach to designing a cross-language, distributed pipeline in the deep learning/machine learning domain, using WebRTC and Redis Streams.
lecture-ai-basics
Course content for the elective Artificial Intelligence I, covering foundational AI concepts and applied exercises.
stac-model
STAC Machine Learning Model (MLM) Extension to describe ML models, their training details, and inference runtime requirements.
https://github.com/usnistgov/chipsff
Evaluation of universal machine learning force-fields https://doi.org/10.1021/acsmaterialslett.5c00093
gnn_tracking
Reconstruct billions of particle trajectories with graph neural networks
tour-championship-strokes-gained-analysis
Developed an expected strokes model to identify player performance for the 2011 PGA TOUR Championship
https://github.com/graphbookai/graphbook
Visual AI development framework for training and inference of ML models, scaling pipelines, and automating workflows with Python. ⭐ Leave a star to support us!
https://github.com/csinva/transformation-importance
Using / reproducing TRIM from the paper "Transformation Importance with Applications to Cosmology" 🌌 (ICLR Workshop 2020)
ml_with_aws_sagemaker
Learn how to scale up ML/AI pipelines using AWS SageMaker (GPUs, Cloud computing)
insect-detect
Detection models and Python scripts for automated insect monitoring with the Insect Detect DIY camera trap.