imodels
imodels: a Python package for fitting interpretable models - Published in JOSS (2021)
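A minimal sketch of imodels' scikit-learn-style interface; RuleFitClassifier is one of its estimators, and the breast-cancer dataset is just a stand-in:

```python
from imodels import RuleFitClassifier
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RuleFitClassifier()          # a sparse rule ensemble; fit/predict like any sklearn estimator
model.fit(X_train, y_train)
print(model.score(X_test, y_test))   # standard sklearn scoring
```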
modelStudio
modelStudio: Interactive Studio with Explanations for ML Predictive Models - Published in JOSS (2019)
FAT Forensics
FAT Forensics: A Python Toolbox for Implementing and Deploying Fairness, Accountability and Transparency Algorithms in Predictive Systems - Published in JOSS (2020)
ExpFamilyPCA.jl
ExpFamilyPCA.jl: A Julia Package for Exponential Family Principal Component Analysis - Published in JOSS (2025)
TSInterpret
TSInterpret: A Python Package for the Interpretability of Time Series Classification - Published in JOSS (2023)
quantus
Quantus is an eXplainable AI toolkit for responsible evaluation of neural network explanations
shap
A game theoretic approach to explain the output of any machine learning model.
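A minimal sketch of the typical SHAP workflow for tree models; the random-forest regressor and diabetes dataset are stand-ins:

```python
import shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

X, y = load_diabetes(return_X_y=True, as_frame=True)
model = RandomForestRegressor(n_estimators=100).fit(X, y)

explainer = shap.TreeExplainer(model)   # fast Shapley values for tree ensembles
shap_values = explainer.shap_values(X)  # one attribution per feature per row
shap.summary_plot(shap_values, X)       # global importance and direction of effects
```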
corelay
CoRelAy is a tool to compose small-scale (single-machine) analysis pipelines.
transformers-interpret
Model explainability that works seamlessly with 🤗 transformers. Explain your transformers model in just 2 lines of code.
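A sketch of the advertised two-line usage; the SST-2 DistilBERT checkpoint is an illustrative choice:

```python
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from transformers_interpret import SequenceClassificationExplainer

name = "distilbert-base-uncased-finetuned-sst-2-english"  # illustrative checkpoint
model = AutoModelForSequenceClassification.from_pretrained(name)
tokenizer = AutoTokenizer.from_pretrained(name)

# the advertised "2 lines": build the explainer, then call it on text
cls_explainer = SequenceClassificationExplainer(model, tokenizer)
word_attributions = cls_explainer("I love interpretable NLP models.")
print(word_attributions)  # (token, attribution score) pairs
```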
torchcam
Class activation maps for your PyTorch models (CAM, Grad-CAM, Grad-CAM++, Smooth Grad-CAM++, Score-CAM, SS-CAM, IS-CAM, XGrad-CAM, Layer-CAM)
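A hedged sketch of torchcam's hook-based CAM extraction; the resnet18 backbone and random input are stand-ins, and automatic target-layer resolution is assumed:

```python
import torch
from torchvision.models import resnet18
from torchcam.methods import GradCAM

model = resnet18(weights="IMAGENET1K_V1").eval()
# GradCAM hooks activations/gradients; this sketch assumes torchcam
# resolves the target conv layer automatically when none is named
cam_extractor = GradCAM(model)

input_tensor = torch.rand(1, 3, 224, 224)  # placeholder image batch
out = model(input_tensor)
# CAM for the top-predicted class
activation_map = cam_extractor(out.squeeze(0).argmax().item(), out)
```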
virelay
ViRelAy is a visualization tool for the analysis of data as generated by CoRelAy.
yggdrasil-decision-forests
A library to train, evaluate, interpret, and productionize decision forest models such as Random Forest and Gradient Boosted Decision Trees.
grad-cam
Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.
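A minimal GradCAM sketch with this library; resnet50, the random input, and ImageNet class 281 are illustrative stand-ins:

```python
import torch
from torchvision.models import resnet50
from pytorch_grad_cam import GradCAM
from pytorch_grad_cam.utils.model_targets import ClassifierOutputTarget

model = resnet50(weights="IMAGENET1K_V2").eval()
target_layers = [model.layer4[-1]]         # last conv block of the backbone
input_tensor = torch.rand(1, 3, 224, 224)  # placeholder image batch

cam = GradCAM(model=model, target_layers=target_layers)
targets = [ClassifierOutputTarget(281)]    # ImageNet class 281 ("tabby cat")
grayscale_cam = cam(input_tensor=input_tensor, targets=targets)[0, :]
```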
multi-mode-cnn-pytorch
A PyTorch implementation of the Multi-Mode CNN to reconstruct Chlorophyll-a time series in the global ocean from oceanic and atmospheric physical drivers
asent
Asent is a Python library for performing efficient and transparent sentiment analysis using spaCy.
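A minimal sketch of asent's spaCy-component usage, assuming its asent_en_v1 English component and a blank pipeline with a sentencizer:

```python
import spacy
import asent  # importing asent registers its components with spaCy

nlp = spacy.blank("en")
nlp.add_pipe("sentencizer")
nlp.add_pipe("asent_en_v1")        # English sentiment component

doc = nlp("I am not very happy, but I am also not especially sad.")
print(doc._.polarity)              # document-level polarity scores
for sent in doc.sents:
    print(sent, sent._.polarity)   # per-sentence scores you can inspect
```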
https://github.com/csinva/tree-prompt
Tree prompting: easy-to-use scikit-learn interface for improved prompting.
hierarchical-dnn-interpretations
Using / reproducing ACD from the paper "Hierarchical interpretations for neural network predictions" 🧠 (ICLR 2019)
iBreakDown
Break Down with interactions for local explanations (SHAP, BreakDown, iBreakDown)
https://github.com/csinva/disentangled-attribution-curves
Using / reproducing DAC from the paper "Disentangled Attribution Curves for Interpreting Random Forests and Boosted Trees"
https://github.com/google-research/reverse-engineering-neural-networks
A collection of tools for reverse engineering neural networks.
live
Local Interpretable (Model-agnostic) Visual Explanations - model visualization for regression problems and tabular data based on the LIME method. Available on CRAN.
tree.interpreter
Decision tree interpreter for randomForest/ranger models, decomposing each prediction into a bias term plus per-feature contributions.
coefeasy
Coefeasy is an R package under development for making regression coefficients more accessible. With this tool, you can read and report key coefficients instantly.
icml19-egocnn
Code for "Distributed, Egocentric Representations of Graphs for Detecting Critical Structures" (ICML 2019)
pnet_robustness
Reliable interpretability of biology-inspired deep neural networks
symbolic-governed-mistral-artifact
Tier-10 sealed governance artifact for Mistral-7B with exact-match benchmarks and a symbolic verifier.
https://github.com/csinva/transformation-importance
Using / reproducing TRIM from the paper "Transformation Importance with Applications to Cosmology" 🌌 (ICLR Workshop 2020)
scene-representation-diffusion-model
A linear probe finds representations of scene attributes in a text-to-image diffusion model.
https://github.com/berenslab/interpretable-deep-survival-analysis
An interpretable end-to-end CNN for disease progression modeling that predicts late AMD onset (MICCAI 2024)
attention-meets-perturbation
📝 Official Implementation of "Attention Meets Perturbation: Robust and Interpretable Attention with Adversarial Training"
https://github.com/csinva/clinical-rule-development
Building and vetting clinical decision rules.
captum-tutorials
Tutorials for Captum, the PyTorch model interpretability library.
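For orientation, a minimal Integrated Gradients sketch with Captum; the toy model and random input are stand-ins:

```python
import torch
import torch.nn as nn
from captum.attr import IntegratedGradients

# toy stand-in model: 4 features -> 2 classes
model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 2)).eval()

x = torch.rand(1, 4)
ig = IntegratedGradients(model)
# per-feature attribution for the class-0 output, plus a convergence check
attributions, delta = ig.attribute(x, target=0, return_convergence_delta=True)
print(attributions, delta)
```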
contrastiveexplanation
Contrastive Explanation (Foil Trees), developed at TNO/Utrecht University
https://github.com/birkhoffg/explainable-ml-papers
A list of research papers of explainable machine learning.
understanding-clip-ood
Official code for the paper: "When and How Does CLIP Enable Domain and Compositional Generalization?" (ICML 2025 Spotlight)
automated-brain-explanations
Generating and validating natural-language explanations for the brain.
https://github.com/csinva/iprompt
Finding semantically meaningful and accurate prompts.
awesome-attention-heads
An awesome repository and comprehensive survey on the interpretability of LLM attention heads.
bcf-iv
Package for estimating heterogeneous causal effects in the presence of imperfect compliance (e.g., instrumental variables, fuzzy regression discontinuity designs).
https://github.com/kundajelab/fastism
In-silico Saturation Mutagenesis implementation with 10x or more speedup for certain architectures.
https://github.com/betswish/mirage
Easy-to-use MIRAGE code for faithful answer attribution in RAG applications. Paper: https://aclanthology.org/2024.emnlp-main.347/
awesome-production-machine-learning
A curated list of awesome open source libraries to deploy, monitor, version and scale your machine learning systems.
talktomodel
TalkToModel gives anyone the power of XAI through natural language conversations 💬!