imodels
imodels: a python package for fitting interpretable models - Published in JOSS (2021)
modelStudio
modelStudio: Interactive Studio with Explanations for ML Predictive Models - Published in JOSS (2019)
FAT Forensics
FAT Forensics: A Python Toolbox for Implementing and Deploying Fairness, Accountability and Transparency Algorithms in Predictive Systems - Published in JOSS (2020)
ExpFamilyPCA.jl
ExpFamilyPCA.jl: A Julia Package for Exponential Family Principal Component Analysis - Published in JOSS (2025)
TSInterpret
TSInterpret: A Python Package for the Interpretability of Time Series Classification - Published in JOSS (2023)
quantus
Quantus is an eXplainable AI toolkit for responsible evaluation of neural network explanations
shap
A game theoretic approach to explain the output of any machine learning model.
corelay
CoRelAy is a tool to compose small-scale (single-machine) analysis pipelines.
transformers-interpret
Model explainability that works seamlessly with 🤗 transformers. Explain your transformers model in just 2 lines of code.
torchcam
Class activation maps for your PyTorch models (CAM, Grad-CAM, Grad-CAM++, Smooth Grad-CAM++, Score-CAM, SS-CAM, IS-CAM, XGrad-CAM, Layer-CAM)
virelay
ViRelAy is a visualization tool for the analysis of data as generated by CoRelAy.
yggdrasil-decision-forests
A library to train, evaluate, interpret, and productionize decision forest models such as Random Forest and Gradient Boosted Decision Trees.
grad-cam
Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.
multi-mode-cnn-pytorch
A PyTorch implementation of the Multi-Mode CNN to reconstruct Chlorophyll-a time series in the global ocean from oceanic and atmospheric physical drivers
asent
Asent is a Python library for performing efficient and transparent sentiment analysis using spaCy.
https://github.com/csinva/tree-prompt
Tree prompting: easy-to-use scikit-learn interface for improved prompting.
hierarchical-dnn-interpretations
Using / reproducing ACD from the paper "Hierarchical interpretations for neural network predictions" 🧠 (ICLR 2019)
iBreakDown
Break Down with interactions for local explanations (SHAP, BreakDown, iBreakDown)
https://github.com/csinva/disentangled-attribution-curves
Using / reproducing DAC from the paper "Disentangled Attribution Curves for Interpreting Random Forests and Boosted Trees"
https://github.com/google-research/reverse-engineering-neural-networks
A collection of tools for reverse engineering neural networks.
live
Local Interpretable (Model-agnostic) Visual Explanations: model visualization for regression problems and tabular data, based on the LIME method. Available on CRAN.
tree.interpreter
Decision tree interpreter for randomForest/ranger as described in
https://github.com/csinva/transformation-importance
Using / reproducing TRIM from the paper "Transformation Importance with Applications to Cosmology" 🌌 (ICLR Workshop 2020)
understanding-clip-ood
Official code for the paper: "When and How Does CLIP Enable Domain and Compositional Generalization?" (ICML 2025 Spotlight)
attention-meets-perturbation
📝 Official Implementation of "Attention Meets Perturbation: Robust and Interpretable Attention with Adversarial Training"
https://github.com/birkhoffg/explainable-ml-papers
A list of research papers of explainable machine learning.
coefeasy
Coefeasy is an R package under development for making regression coefficients more accessible. With this tool, you can read and report key coefficients instantly.
talktomodel
TalkToModel gives anyone the powers of XAI through natural language conversations 💬!
awesome-attention-heads
An awesome repository and a comprehensive survey on the interpretability of LLM attention heads.
https://github.com/kundajelab/fastism
In-silico Saturation Mutagenesis implementation with 10x or more speedup for certain architectures.
icml19-egocnn
Code for "Distributed, Egocentric Representations of Graphs for Detecting Critical Structures" (ICML 2019)
automated-brain-explanations
Generating and validating natural-language explanations for the brain.
captum-tutorials
pnet_robustness
Reliable interpretability of biology-inspired deep neural networks
https://github.com/berenslab/interpretable-deep-survival-analysis
An interpretable end-to-end CNN for disease progression modeling that predicts late AMD onset (MICCAI 2024)
awesome-production-machine-learning
A curated list of awesome open source libraries to deploy, monitor, version and scale your machine learning
https://github.com/csinva/iprompt
Finding semantically meaningful and accurate prompts.
symbolic-governed-mistral-artifact
Tier-10 sealed governance artifact for Mistral-7B with exact-match benchmarks and symbolic verifier.
scene-representation-diffusion-model
Linear probes reveal representations of scene attributes in a text-to-image diffusion model
contrastiveexplanation
Contrastive Explanation (Foil Trees), developed at TNO/Utrecht University
bcf-iv
Package for heterogeneous causal effects in the presence of imperfect compliance (e.g., instrumental variables, fuzzy regression discontinuity designs)
https://github.com/csinva/clinical-rule-development
Building and vetting clinical decision rules.
https://github.com/betswish/mirage
Easy-to-use MIRAGE code for faithful answer attribution in RAG applications. Paper: https://aclanthology.org/2024.emnlp-main.347/